Recently, I was asked to help with a side optimization project at our company, and I've done some research. I'm still not 100% sure this is the most efficient way to do it.
Problem:
- Scraping over a dozen pieces of information from an internal system (a website) and passing them into a Microsoft Office document template.
Restrictions:
- The website works only in IE 9
- The system does not expose any API / web services
- This will be used on more than 100 different workstations
- The workstations have only IE 9; no Firefox or Chrome is allowed
- Getting approval to install any software other than the default Windows tools on the workstations is almost impossible
We have built a small working proof of concept for this. It uses a Visual Basic + JavaScript combination. In short: Visual Basic opens an instance of IE, then with JavaScript we log into the system, navigate to the tabs we need and find the relevant information, and finally push that data into the Office template.
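To make the idea concrete, here is a minimal sketch of that flow as a VBScript file run under Windows Script Host (which ships with Windows, so it satisfies the "default tools only" restriction). The URL, file path, element IDs, and bookmark name are all assumptions for illustration, not the actual system's names:

```vbscript
' Sketch only: URL, path, element IDs and bookmark name are hypothetical.
Dim ie, doc
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.Navigate "http://intranet.example/login"   ' placeholder URL

' Wait until IE has finished loading the page.
Do While ie.Busy Or ie.ReadyState <> 4
    WScript.Sleep 100
Loop

Set doc = ie.Document
' Fill in the login form (element IDs are assumptions).
doc.getElementById("username").Value = "user"
doc.getElementById("password").Value = "secret"
doc.getElementById("loginButton").Click

' ... wait again, navigate to the required tab, then read a value ...
Dim scraped
scraped = doc.getElementById("customerName").innerText

' Push the scraped value into a Word template via COM automation.
Dim word, docx
Set word = CreateObject("Word.Application")
Set docx = word.Documents.Open("C:\templates\report.dotx")  ' placeholder path
docx.Bookmarks("CustomerName").Range.Text = scraped
```

The same COM objects (`InternetExplorer.Application`, `Word.Application`) are scriptable from VBA inside the Office template itself, which would remove the need to deploy a separate script file to each workstation.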
It works, but I'm not convinced this approach is the best one.
We have tried several other solutions (a Node.js server, Selenium, some other web scrapers), but they all seem to run into one of the restrictions above.