text processing - Linear table to matrix format


I want to convert a linear table to matrix format.

My input table, called "linear_table.tab", looks like this:

    transcript      ortho
    transcript_1    ortho_1
    transcript_2    ortho_2
    transcript_3    ortho_3
    transcript_4    ortho_4
    transcript_5    ortho_5
    transcript_6    ortho_6
    transcript_7    ortho_5
    transcript_8    ortho_1
    transcript_9    ortho_4
    transcript_10   ortho_5
    transcript_11   ortho_2
    transcript_12   ortho_7
    transcript_13   ortho_8
    transcript_14   ortho_5
    transcript_15   ortho_2
    transcript_16   ortho_9

What I want is a matrix table like this:

                    transcript_1    transcript_2    transcript_3    transcript_4    transcript_5    transcript_6    transcript_7    transcript_8    transcript_9    transcript_10   transcript_11   transcript_12   transcript_13   transcript_14   transcript_15   transcript_16
    transcript_1    0   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
    transcript_2    0   0   0   0   0   0   0   0   0   0   1   0   0   0   1   0
    transcript_3    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_4    0   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0
    transcript_5    0   0   0   0   0   0   1   0   0   1   0   0   0   1   0   0
    transcript_6    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_7    0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0
    transcript_8    1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_9    0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_10   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0
    transcript_11   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_12   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_13   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_14   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0
    transcript_15   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    transcript_16   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

Here is my code using R:

    linear.table <- read.table("linear_table.tab", header=TRUE, sep="\t")
    library(reshape2)
    dcast(linear.table, transcript~ortho, fill=0)

I get the following output in R:

       transcript ortho_1 ortho_2 ortho_3 ortho_4 ortho_5 ortho_6 ortho_7 ortho_8 ortho_9
     transcript_1 ortho_1       0       0       0       0       0       0       0       0
    transcript_10       0       0       0       0 ortho_5       0       0       0       0
    transcript_11       0 ortho_2       0       0       0       0       0       0       0
    transcript_12       0       0       0       0       0       0 ortho_7       0       0
    transcript_13       0       0       0       0       0       0       0 ortho_8       0
    transcript_14       0       0       0       0 ortho_5       0       0       0       0
    transcript_15       0 ortho_2       0       0       0       0       0       0       0
    transcript_16       0       0       0       0       0       0       0       0 ortho_9
     transcript_2       0 ortho_2       0       0       0       0       0       0       0
     transcript_3       0       0 ortho_3       0       0       0       0       0       0
     transcript_4       0       0       0 ortho_4       0       0       0       0       0
     transcript_5       0       0       0       0 ortho_5       0       0       0       0
     transcript_6       0       0       0       0       0 ortho_6       0       0       0
     transcript_7       0       0       0       0 ortho_5       0       0       0       0
     transcript_8 ortho_1       0       0       0       0       0       0       0       0
     transcript_9       0       0       0 ortho_4       0       0       0       0       0

I am not sure how to proceed from here using R.

Using awk (GNU awk, since the script relies on arrays of arrays):

    $ cat ortho.awk
    NR > 1 {
      transcript = $1;
      ortho = $2;
      i = transcript;
      j = ortho;
      sub("transcript_", "", i);   # keep only the numeric suffix
      sub("ortho_", "", j);
      imx[i][j] = 1;               # sparse entry: transcript i, ortho j
    }
    END {
      for (i in imx) {
        for (j in imx) {
          # "+" is numeric addition in awk, so "transcript_"+i is just the number i
          omx["transcript_"+i]["transcript_"+j] = imx[i][j] == "" ? 0 : 1;
        }
      }

      printf("\t");
      for (i in omx) {
        printf "\ttranscript%d", i;
      }
      print "";

      for (i in omx) {
        printf "transcript%d", i;
        for (j in omx) {
          printf "\t%d", omx[i][j];
        }
        print "";
      }

    }

The idea is to populate a sparse matrix of 1's, then at the end fill in 0's in the missing spots and print it out.
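
For comparison, here is a minimal sketch of the same fill-and-print idea in portable awk (no GNU extensions). It rests on one assumption that goes beyond the question: two different transcripts get a 1 whenever they share an ortho group. The desired table above marks only some of those pairs (transcript_11 and transcript_15 both have ortho_2 but are 0 there), so treat this as one possible reading rather than the exact target. The file names sample.tab and matrix.tab are made up for the example, and the sample holds only the first eight rows of the input table.

```shell
# Hypothetical sample input: header plus the first eight rows of the question's table.
printf '%s\t%s\n' \
    transcript ortho \
    transcript_1 ortho_1 \
    transcript_2 ortho_2 \
    transcript_3 ortho_3 \
    transcript_4 ortho_4 \
    transcript_5 ortho_5 \
    transcript_6 ortho_6 \
    transcript_7 ortho_5 \
    transcript_8 ortho_1 > sample.tab

awk -F'\t' '
NR > 1 {
    name[++n] = $1      # transcript names, kept in input order
    group[n]  = $2      # ortho group of the n-th transcript
}
END {
    # header row: empty corner cell, then one column per transcript
    for (j = 1; j <= n; j++) printf "\t%s", name[j]
    print ""
    # one row per transcript; a cell is 1 when two different
    # transcripts have the same ortho group (assumed rule), else 0
    for (i = 1; i <= n; i++) {
        printf "%s", name[i]
        for (j = 1; j <= n; j++)
            printf "\t%d", (i != j && group[i] == group[j])
        print ""
    }
}' sample.tab > matrix.tab

cat matrix.tab
```

Keeping the names in a numbered array fixes the row and column order to the input order, which `for (i in array)` does not guarantee in awk.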

