Geeky post again – no math this time, but computer code.
I’m sure people have done this before, but it seemed a nice opportunity to practice my Python skills by writing a small script for the following problem. When I read a scientific article, I usually watch out for the following elements:
- Innovation: what does the study do that others haven’t done before?
- Method: what method did they use?
- Data: where did they get their data from?
- Results: what are the main results?
- Relevance: who benefits from this research, and how?
I also like to place the research in one of the four quadrants in this post. I find it helpful to make an overview of these questions in a table:
| 1st author | Year | Journal | Quadrant | Innovation | Method | Data | Results | Relevance |
|---|---|---|---|---|---|---|---|---|
| Kompas | 2005 | Journal of Productivity Analysis | 4 | Estimates efficiency gains quota trade for Southeast Trawl Fishery, AU | Stochastic frontier analysis | AFMA and ABARE survey data | ITQs gave efficiency gains | Policy debate on ITQs |
| Kompas | 2006 | Pacific Economic Bulletin | 3 | Estimates optimal effort levels and allocation across species | Multifleet, multispecies, multiregion bioeconomic model | SPC data | Effort reduction needed; optimal stocks larger than BMSY | Policy debate on MEY |
But here’s the problem: I usually keep my notes in a BibTeX file (as a good geek should), which looks like this:
@ARTICLE{Kompas2006PacEconBull,
  author = {Kompas, T. and Che, T.N.},
  title = {Economic profit and optimal effort in the Western and Central Pacific tuna fisheries},
  journal = {Pacific Economic Bulletin},
  year = {2006},
  volume = {21},
  pages = {46-62},
  number = {3},
  data = {SPC data},
  innovation = {Estimates optimal effort levels and allocation across species},
  quadrant = {3},
  keywords = {tuna; bioeconomic model; optimisation; Pacific},
  method = {Multifleet, multispecies, multiregion bioeconomic model},
  results = {Effort reduction needed; optimal stocks larger than BMSY},
  relevance = {Policy debate on MEY}
}
@ARTICLE{Kompas2005JProdAnalysis,
  author = {Kompas, Tom and Che, Tuong Nhu},
  title = {Efficiency gains and cost reductions from individual transferable quotas: A stochastic cost frontier for the Australian South East fishery},
  journal = {Journal of Productivity Analysis},
  year = {2005},
  volume = {23},
  pages = {285-307},
  number = {3},
  quadrant = {3},
  data = {AFMA and ABARE survey data},
  innovation = {Estimates efficiency gains quota trade for Southeast Trawl Fishery, AU},
  keywords = {individual transferable quotas; stochastic cost frontier; fishery efficiency; ITQs},
  method = {Stochastic frontier analysis},
  relevance = {Policy debate on ITQs.},
  results = {ITQs gave efficiency gains}
}
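Because the reading notes sit in custom fields next to the standard ones, it is easy to check which notes an entry still lacks. A minimal sketch, assuming each entry has already been read into a plain dictionary of field names and string values. The first dictionary is typed in by hand from the 2006 entry above, the second is a made-up entry, and `NOTE_FIELDS` and `missing_notes` are names invented for this illustration:

```python
# Custom (non-standard) BibTeX fields that hold the reading notes.
NOTE_FIELDS = ["quadrant", "innovation", "method", "data", "results", "relevance"]


def missing_notes(entry):
    """Return the note fields that are absent or empty in one entry."""
    return [name for name in NOTE_FIELDS if not entry.get(name, "").strip()]


# Hand-typed copy of the 2006 entry; not actual parser output.
kompas2006 = {
    "author": "Kompas, T. and Che, T.N.",
    "journal": "Pacific Economic Bulletin",
    "year": "2006",
    "data": "SPC data",
    "innovation": "Estimates optimal effort levels and allocation across species",
    "quadrant": "3",
    "method": "Multifleet, multispecies, multiregion bioeconomic model",
    "results": "Effort reduction needed; optimal stocks larger than BMSY",
    "relevance": "Policy debate on MEY",
}

# A hypothetical entry that has only been partly annotated so far.
half_done = {"author": "Doe, J.", "year": "2010", "quadrant": "2", "method": ""}

print(missing_notes(kompas2006))  # []
print(missing_notes(half_done))   # ['innovation', 'method', 'data', 'results', 'relevance']
```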
I don’t want to copy it all by hand, so I wrote this little Python script to convert the entries I select from the BibTeX file (here, the articles with Kompas as an author) to a CSV file:
import csv
from operator import itemgetter

from bibtexparser.bparser import BibTexParser

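def readFirstAuthor(inpList, num):
    """Return the surname of the first author of entry number num."""
    # Authors are stored as "Surname, Initials and Surname, Initials",
    # so everything before the first comma is the first author's surname.
    return inpList[num]['author'].split(',')[0]
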
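def selectDict(inpList, name):
    """Keep the article entries whose author field contains name."""
    outObj = []
    for entry in inpList:
        # .get() skips entries without an author instead of raising KeyError.
        # 'type' is the key this parser uses for the entry type;
        # bibtexparser 1.x calls it 'ENTRYTYPE'.
        if name in entry.get('author', '') and entry.get('type') == 'article':
            outObj.append(entry)
    return outObj
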
def selectFieldsDict(inpList, fieldNames):
    """Reduce each entry to fieldNames, marking missing fields as 'blank'."""
    outObj = []
    for i in range(len(inpList)):
        temp = {}
        for n in fieldNames:
            if n == 'author':
                # Only the first author goes into the table
                temp['author'] = readFirstAuthor(inpList, i)
            else:
                temp[n] = inpList[i].get(n, 'blank')
        outObj.append(temp)
    return outObj

fieldnames = ['author', 'year', 'journal', 'quadrant',
              'innovation', 'method', 'data', 'results', 'relevance']

# Parse the BibTeX file. These calls belong to the bibtexparser release used
# when this was written; bibtexparser 1.x reads a file with
# bibtexparser.load(bibfile).entries instead.
with open('BibTexFile.bib', 'r') as bibfile:
    bp = BibTexParser(bibfile)
    record_list = bp.get_entry_list()

dictSelection = selectDict(record_list, 'Kompas')
fieldSelection = selectFieldsDict(dictSelection, fieldnames)
# Years are four-digit strings, so sorting them as text is chronological
rows = sorted(fieldSelection, key=itemgetter('year'))

# Python 3: open in text mode with newline='' (Python 2 needed 'wb')
with open('output.csv', 'w', newline='') as csvfile:
    csvwriter = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    csvwriter.writeheader()
    csvwriter.writerows(rows)
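On the CSV side, `csv.DictWriter` can do some of the bookkeeping itself: `restval` fills in columns an entry lacks, and `extrasaction='ignore'` drops fields that are not in the table. A dependency-free sketch of that idea, using hand-typed dictionaries in place of parser output and writing to an in-memory buffer instead of `output.csv`. The name `to_csv_text` and the sample dictionaries are assumptions made for this example, and every entry is assumed to have a numeric `year`:

```python
import csv
import io

FIELDNAMES = ["author", "year", "journal", "quadrant", "innovation",
              "method", "data", "results", "relevance"]

# Hand-written stand-ins for parsed entries (not real parser output).
entries = [
    {"author": "Kompas, T. and Che, T.N.", "year": "2006",
     "journal": "Pacific Economic Bulletin", "quadrant": "3",
     "volume": "21"},
    {"author": "Kompas, Tom and Che, Tuong Nhu", "year": "2005",
     "journal": "Journal of Productivity Analysis", "quadrant": "3",
     "method": "Stochastic frontier analysis"},
]


def to_csv_text(entries, fieldnames=FIELDNAMES):
    """Return the entries as CSV text, oldest first, first author only."""
    rows = []
    for entry in entries:
        row = dict(entry)  # copy, so the input is left untouched
        # "Kompas, T. and Che, T.N." -> "Kompas"
        row["author"] = row.get("author", "").split(",")[0]
        rows.append(row)
    rows.sort(key=lambda r: int(r["year"]))  # numeric rather than text order

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval="blank",        # fills missing columns
                            extrasaction="ignore",  # drops e.g. 'volume'
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(entries)
print(csv_text)
```

Writing to a real file works the same way: open it with `open('output.csv', 'w', newline='')` and pass the file object to `csv.DictWriter` in place of the buffer.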
If you are a Python developer: any comments on this are welcome. I’m sure it’s not perfect.