README.md

## Usage:

* 	Download the repository by typing this in a terminal:

	```shell
	git clone https://gitlab.eecs.umich.edu/junhao/ioe437scraper
	```

*	Before running, make sure Firefox, Python 2.7, Jupyter Notebook, and pip are installed on your local computer. Also install selenium, pandas, and bs4 by typing this in a terminal:
	
	```shell
	pip install selenium
	pip install pandas
	pip install bs4
	```

* 	Generate two pandas DataFrame pickle files for further analysis of the journals by typing this in a terminal.
* 	IMPORTANT: enter your unique name and password when the program prompts for them.
	
	```shell
	python parsejournal.py
	```

* Generate a view of each pandas DataFrame by typing this in a Python interpreter or Jupyter notebook:
   
   ```python
   import pandas

   pandas.read_pickle('AAPdata') # get view for Accident Analysis and Prevention data
   
   pandas.read_pickle('TRPdata') # get view for Transportation Research Part F data
   ```

* Generate CSV files from the pandas DataFrames by typing this in a Python interpreter or Jupyter notebook:
	
	```python
	import pandas
	pandas.read_pickle('AAPdata').to_csv('AAP.csv',encoding='utf-8') # create csv file for Accident Analysis and Prevention data

	pandas.read_pickle('TRPdata').to_csv('TRP.csv',encoding='utf-8') # create csv file for Transportation Research Part F data
	```
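
Before running `parsejournal.py`, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch, not part of this repository: the names `REQUIRED` and `missing_packages` are made up for illustration, and the script only checks Python packages, not Firefox or Jupyter.

```python
import importlib

# Import names of the packages this README asks you to install.
# (Illustrative list; adjust it if the scraper's dependencies change.)
REQUIRED = ["selenium", "pandas", "bs4"]


def missing_packages(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If the script reports missing packages, installing them with `pip install` as described above should resolve it.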

## Content:

* AAP.csv and TRP.csv are already partially tagged with topics and include links to related PDF files for further reading

## Future Development:

* Develop an API that turns class literature review / research into a one-click task using data mining and machine learning

* Apply a word2vec model to paper titles for classification / clustering of topics

* Run data mining on author names to link research topics with the authors' universities / nationalities

* Create visualizations of how topics change across years / nationalities / universities

* Create predictive models of future topics from possible inputs

* Predict the "next big thing" research topic and the related research-rich universities for different countries


## Interested?

* Please contact junhao@umich.edu