```bash
git clone https://gitlab.eecs.umich.edu/junhao/ioe437scraper
```
* Before running, make sure Firefox, Python 2.7, Jupyter Notebook, and pip are installed on your local computer. Also install selenium, pandas, and bs4 by typing this in the terminal:
```bash
pip install selenium
pip install pandas
pip install bs4
```
* Generate two pandas DataFrame pickle files for further analysis of the journals by typing this in the terminal.
* View the pandas DataFrames by typing this in a Python interpreter / Jupyter notebook:
import pandas
pandas.read_pickle('AAPdata')  # view the Accident Analysis and Prevention data
pandas.read_pickle('TRPdata')  # view the Transportation Research Part F data
* Generate CSV files from the pandas DataFrames by typing this in a Python interpreter / Jupyter notebook:
import pandas
pandas.read_pickle('AAPdata').to_csv('AAP.csv', encoding='utf-8')  # create the CSV file for the Accident Analysis and Prevention data
pandas.read_pickle('TRPdata').to_csv('TRP.csv', encoding='utf-8')  # create the CSV file for the Transportation Research Part F data
* AAP.csv and TRP.csv are already partially tagged with topics and include links to the related PDF files for further reading.
* Develop an API that turns class literature review / research into a one-click task using data mining and machine learning
* Apply a word2vec model to paper titles for classification / clustering of topics
* Run data mining on author names to link research topics with author university / nationality
* Create visualizations of how topics change across years / nationalities / universities
* Create predictive models of future topics from possible inputs
* Predict the "next big thing" research topic and the related research-rich universities for different countries
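
As a starting point for the topic-over-time item above, the counting step might be sketched as follows. This is only a minimal illustration: the column names `year` and `topic` are assumptions (the actual columns of AAP.csv and TRP.csv may differ), the function name `topic_counts_by_year` is hypothetical, and the sample rows are made up rather than taken from the scraped data.

```python
from collections import Counter, defaultdict


def topic_counts_by_year(rows, year_key="year", topic_key="topic"):
    """Count how many papers carry each topic tag in each year.

    `rows` is any iterable of dict-like records, for example the rows
    yielded by csv.DictReader. The default column names are assumptions.
    """
    counts = defaultdict(Counter)
    for row in rows:
        topic = (row.get(topic_key) or "").strip()
        if not topic:
            continue  # skip papers that have not been tagged yet
        counts[int(row[year_key])][topic] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {year: dict(counter) for year, counter in counts.items()}


# Hypothetical records, for illustration only (not real scraped data).
sample_rows = [
    {"year": "2015", "topic": "distraction"},
    {"year": "2015", "topic": "distraction"},
    {"year": "2015", "topic": ""},
    {"year": "2016", "topic": "fatigue"},
]
result = topic_counts_by_year(sample_rows)
print(result)
```

To try it on the real data, the rows could come from `csv.DictReader` opened on AAP.csv or TRP.csv, with `year_key` and `topic_key` set to whatever those files actually call the columns. The per-year counts can then be fed to any plotting tool.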