r - dplyr: how to read a TSV file with headers while skipping some lines?

- January 15, 2012

This question already has an answer here:

read.csv, header on first line, skip second line (1 answer)

I have a simple TSV file with the following structure:

0 - header line
1 - empty line
2 - Pig schema
3 - empty line
4 - first line of data
5 - second line of data

I would like to read it, possibly using readr::read_tsv, but here is the problem.

As you can see, the first row contains the headers. Then I have 3 rows that I do not want to read (they contain super weird data coming from Apache Pig), and at row 4 the data starts. In pandas, I would do something like

df = pd.read_csv('/localpath/data.tsv', sep='\t', skiprows=[1,2,3])

which allows me to read the headers and skip rows one, two, and three.

I don't see a similar option in readr::read_tsv. I can only do something like:

df = read_tsv('/localpath/data.tsv', col_names = TRUE, skip = 4)

which does not parse the headers...

Any ideas?

Posting my comment as an answer. Basically, read in the first row as our header, and then read in the remaining rows as data:

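library(readr)
# Read only the first line; its values are the column names
names_t <- read_tsv('/localpath/data.tsv', col_names = FALSE, n_max = 1)
# Skip the header line and the three unwanted lines, then read the data
df1 <- read_tsv('/localpath/data.tsv', col_names = FALSE, skip = 4)
# Flatten the one-row tibble into a plain character vector of names
names(df1) <- unlist(names_t, use.names = FALSE)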

Note that in my comment I specified nrows = 1 to read in the names (this would work with read.csv), but it appears that this argument is replaced by n_max in readr::read_tsv.
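
A possible variant of the same idea, as a minimal sketch: since col_names also accepts a character vector, the header can be passed straight into the second read instead of being assigned afterwards. The helper name read_tsv_with_gap and its n_skip parameter are made up for illustration and are not part of readr; the sketch also assumes the header is the first physical line and that its fields are separated by plain tabs with no quoting.

```r
library(readr)

# Hypothetical helper (not part of readr): take the column names from
# line 1, skip the next `n_skip` lines, and parse the rest as data.
read_tsv_with_gap <- function(path, n_skip = 3) {
  # Read the first physical line and split it on tabs to get the names.
  header <- strsplit(read_lines(path, n_max = 1), "\t", fixed = TRUE)[[1]]
  # Skip the header line plus the unwanted lines; apply the names directly.
  read_tsv(path, col_names = header, skip = 1 + n_skip)
}
```

For the file in the question this would be called as read_tsv_with_gap('/localpath/data.tsv'), with n_skip left at 3 for the empty line, the Pig schema, and the second empty line.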
