R - Writing a loop to extract Comtrade data and export it to multiple CSV files
I am writing a loop to extract UN Comtrade export data for each country (say country i) and export them into individual CSV files, one per country, for later use. I used the "get.comtrade" function (written by Stefan A.) as a starting point and looped through each unique 3-digit country code in the Comtrade data. Also, I am only interested in obtaining "commodity" "export" data (from country i to all other countries) and have no use for the other information contained in the "list" extracted using the get.comtrade function, so I extract only the "year", "reporter", "partner", and "trade value" data for each country and assemble them into a CSV table.
To illustrate with an example, I begin with the USA (country code = 842) in the year 2006, and wrote
d <- get.comtrade(r="842", p="all", ps="2006", type="C", rg="2", freq="A")
which gives me a "list" d that contains 2 items, "validation" and "data". The latter (data) contains 35 variables and is the data frame of interest. However, I only need the following information (given as index column number): [2] year, [10] rttitle = reporter, [13] pttitle = partner, [32] trade.value..us..
So I entered
df <- cbind(d$data[2], d$data[10], d$data[13], d$data[32])
to make a smaller data frame and adjust the "trade value" to million USD.
df$million <- as.numeric(as.character(df[,4]))/1000000
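Selecting columns by position breaks silently if the column order ever changes, so selecting by name may be safer. Below is a minimal sketch on a made-up stand-in for d$data; the column names (year, reporter, partner, trade_value) are placeholders, and the real ones should be checked with names(d$data).

```r
# Made-up stand-in for d$data; the real column names are not known here
# and should be checked with names(d$data).
data <- data.frame(
  id = 1:2,
  year = c("2006", "2006"),
  reporter = c("USA", "USA"),
  partner = c("World", "Canada"),
  trade_value = c("1000000", "2500000"),
  stringsAsFactors = TRUE  # mimic values arriving as factors
)

# Pick the columns of interest by name instead of by position.
keep <- c("year", "reporter", "partner", "trade_value")
df <- data[, keep]

# Convert factor -> character -> numeric, then rescale to million USD.
df$million <- as.numeric(as.character(df$trade_value)) / 1e6
print(df)
```

Going through as.character() first matters when the values are factors, because as.numeric() on a factor returns the internal level codes rather than the printed numbers.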
Finally, I export the CSV file to the designated working directory
write.table(df, "~/working directory/data.csv", na = "NA", row.names = FALSE, col.names = TRUE, sep=",")
This is straightforward and indeed gives me a four-column CSV table. Building on this simple format, I went on to write a loop to obtain a CSV output for each country:
for (i in 1:length(country_list)){
  d <- get.comtrade(r="i", p="all", ps="2006", type="C", rg="2", freq="A")
  df <- cbind(d$data[2], d$data[10], d$data[13], d$data[32])
  df$million <- as.numeric(as.character(df[,4]))/1000000
  myfile <- file.path("~/working directory", paste0("_", i, ".csv"))
  write.table(df, file=myfile, na = "NA", row.names = FALSE, col.names = FALSE, quote=FALSE, append=FALSE, sep="")
}
But it returns an error message saying that
Error in `[.data.frame`(d$data, 2) : undefined columns selected
indicating that the error occurs on the second line of the loop, that is, the columns in d$data are undefined. I have checked previous questions on this board pertaining to the column name question in list and/or data frame format, and have been trying different methods for literally 3 days, but I still have not been able to figure out a solution to the problem.
I understand this is supposed to be a simple loop, but I am bogged down.
What's your take on this? It would be appreciated if you could point out the errors behind the "undefined column" issue, since it pops up all the time in a number of earlier threads I've checked so far.
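One thing worth checking in the loop above: r="i" passes the literal string "i" rather than a country code, and 1:length(country_list) yields positions rather than the codes themselves. The sketch below loops over the codes directly and skips any response that lacks the expected columns. To keep it runnable without network access it uses a hypothetical stand-in, fake_get_comtrade, in place of get.comtrade; the helper export_country, the column names and the output file names are also assumptions made for illustration.

```r
# Hypothetical stand-in for get.comtrade(): same list(validation, data)
# return shape, but no network access. Column names are placeholders.
fake_get_comtrade <- function(r, ...) {
  if (r == "842") {
    data <- data.frame(
      year = "2006",
      reporter = "USA",
      partner = c("World", "Canada"),
      trade_value = c("1000000", "2500000"),
      stringsAsFactors = FALSE
    )
  } else {
    data <- NULL  # e.g. an unrecognised reporter code such as "i"
  }
  list(validation = NULL, data = data)
}

# Fetch one country, keep four columns, write one CSV; skip on bad data.
export_country <- function(code, out_dir, fetch = fake_get_comtrade) {
  d <- fetch(r = code, p = "all", ps = "2006", type = "C", rg = "2", freq = "A")
  wanted <- c("year", "reporter", "partner", "trade_value")
  if (!is.data.frame(d$data) || !all(wanted %in% names(d$data))) {
    message("Skipping ", code, ": no usable data returned")
    return(invisible(NULL))
  }
  df <- d$data[, wanted, drop = FALSE]
  df$million <- as.numeric(as.character(df$trade_value)) / 1e6
  myfile <- file.path(out_dir, paste0("comtrade_", code, ".csv"))
  write.csv(df, file = myfile, row.names = FALSE)
  invisible(myfile)
}

# "i" is included on purpose to show what a literal "i" request does here.
country_list <- c("842", "i")
out_dir <- tempdir()
written <- lapply(country_list, export_country, out_dir = out_dir)
```

With the real function, passing fetch = get.comtrade would be the intended swap. Checking the columns before subsetting turns a hard error into a skipped country, which makes it easier to see which requests came back without usable data.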
Thank you.
By the way, in order to execute the code above, one needs Stefan A.'s get.comtrade code:
get.comtrade <- function(url="http://comtrade.un.org/api/get?"
                         ,maxrec=50000
                         ,type="C"
                         ,freq="A"
                         ,px="HS"
                         ,ps="now"
                         ,r
                         ,p
                         ,rg="all"
                         ,cc="TOTAL"
                         ,fmt="json"
)
{
  string<- paste(url
                 ,"max=",maxrec,"&" #maximum no. of records returned
                 ,"type=",type,"&" #type of trade (c=commodities)
                 ,"freq=",freq,"&" #frequency
                 ,"px=",px,"&" #classification
                 ,"ps=",ps,"&" #time period
                 ,"r=",r,"&" #reporting area
                 ,"p=",p,"&" #partner country
                 ,"rg=",rg,"&" #trade flow
                 ,"cc=",cc,"&" #classification code
                 ,"fmt=",fmt #format
                 ,sep = ""
  )

  if(fmt == "csv") {
    raw.data<- read.csv(string,header=TRUE)
    return(list(validation=NULL, data=raw.data))
  } else {
    if(fmt == "json" ) {
      raw.data<- fromJSON(file=string) #fromJSON comes from the rjson package
      data<- raw.data$dataset
      validation<- unlist(raw.data$validation, recursive=TRUE)
      ndata<- NULL
      if(length(data)> 0) {
        var.names<- names(data[[1]])
        data<- as.data.frame(t( sapply(data,rbind)))
        ndata<- NULL
        for(i in 1:ncol(data)){
          data[sapply(data[,i],is.null),i]<- NA
          ndata<- cbind(ndata, unlist(data[,i]))
        }
        ndata<- as.data.frame(ndata)
        colnames(ndata)<- var.names
      }
      return(list(validation=validation,data =ndata))
    }
  }
}
and the list of reporters from
library(rjson)
string <- "http://comtrade.un.org/data/cache/partnerareas.json"
reporters <- fromJSON(file=string)
reporters <- as.data.frame(t(sapply(reporters$results,rbind)))
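The loop refers to country_list, which is not defined anywhere in the code shown. A possible way to build it from the parsed reporter list is sketched below on a made-up stand-in for reporters$results; the id and text field names, and the "all" entry, are assumptions about the JSON layout and should be verified against the actual response.

```r
# Made-up stand-in for reporters$results (before the as.data.frame step);
# the id/text field names are an assumption about the JSON layout.
results <- list(
  list(id = "all", text = "All"),
  list(id = "842", text = "USA"),
  list(id = "124", text = "Canada")
)

# Pull out the codes as a plain character vector.
ids <- vapply(results, function(x) x$id, character(1))

# Drop the aggregate entry so only individual country codes remain.
country_list <- ids[ids != "all"]

# Iterate over the codes themselves rather than over 1:length(country_list).
for (code in country_list) {
  message("Would request r = ", code)
}
```

Keeping the codes as character strings matches how r is pasted into the request URL by get.comtrade.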