By inferred sex, the samples split into two groups:
bastian@phix ~/eth_hackdays/plink_binaries ᐅ cat *.log|grep Inferred|sort|uniq -c
    578 Inferred sex: female.
    845 Inferred sex: male.
Total genotyping rate is 0.482581.
1626942 variants and 1423 people pass filters and QC.
Note: No phenotypes present
So much for the data processing; now we can move on to the PCA itself.
smartpca.perl -i merged-23andme-opensnp-filtered-noLD.bed -a merged-23andme-opensnp-filtered-noLD.bim -b merged-23andme-opensnp-filtered-noLD.fam -l test.log -o test.pca -p test.plot -e test.eigen
The -i, -a and -b options are fed with the corresponding bed, bim and fam files from PLINK. The -l, -o, -p and -e options name the output files of smartpca. Note that -o gives just the prefix: the eigenvectors used for plotting end up in test.pca.evec.
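Since the same two prefixes are repeated several times in that call, here is a minimal sketch of a helper that assembles it. The function name smartpca_cmd is made up for illustration, and it only prints the command instead of running it, so you can check the file names first.

```shell
# Hypothetical helper (not part of EIGENSOFT): print the smartpca.perl call
# for a PLINK input prefix ($1) and an output prefix ($2).
# Nothing is executed here; the command is only written to stdout.
smartpca_cmd() {
    plink_prefix="$1"   # e.g. merged-23andme-opensnp-filtered-noLD
    out_prefix="$2"     # e.g. test
    printf 'smartpca.perl -i %s.bed -a %s.bim -b %s.fam -l %s.log -o %s.pca -p %s.plot -e %s.eigen\n' \
        "$plink_prefix" "$plink_prefix" "$plink_prefix" \
        "$out_prefix" "$out_prefix" "$out_prefix" "$out_prefix"
}
```

Calling smartpca_cmd merged-23andme-opensnp-filtered-noLD test prints exactly the command used above.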
With that done, we can take the generated test.pca.evec and plot it with R & ggplot2.
library(ggplot2)
# Column 1 (V1) is the sample ID; V2, V3 and V4 hold PC1, PC2 and PC3.
d <- read.table("test.pca.evec", as.is = TRUE)
ggplot(d, aes(x = V2, y = V3, color = V4)) +
  geom_point() +
  scale_x_continuous("PC1") +
  scale_y_continuous("PC2") +
  scale_color_continuous("PC3")
This should yield the upper of the two plots linked below. In an even quicker and dirtier (and totally undocumented) approach, I associated the genotyping files with the phenotypes given by the openSNP users, here the eye color.
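As that step is undocumented, the following is only a rough sketch of how such an association could be done, not the procedure I actually used. It assumes a hypothetical tab-separated file phenotypes.tsv with one "sample ID, eye color" pair per line, where the sample ID matches the first column of test.pca.evec exactly. The function name annotate_evec is likewise made up.

```shell
# Hypothetical helper: attach a phenotype label to each sample in a smartpca
# .evec file. $1 is the .evec file, $2 the tab-separated phenotype table
# (assumed layout: sample_id<TAB>phenotype). Prints sample, PC1, PC2 and the
# label, tab-separated; samples without a phenotype get "NA".
annotate_evec() {
    awk '
        FNR == NR {
            # First file (phenotypes): split on the first tab only, so that
            # values containing spaces stay intact.
            tab = index($0, "\t")
            if (tab > 0) pheno[substr($0, 1, tab - 1)] = substr($0, tab + 1)
            next
        }
        $1 ~ /^#/ { next }   # skip the "#eigvals:" header line of the .evec file
        {
            label = ($1 in pheno) ? pheno[$1] : "NA"
            printf "%s\t%s\t%s\t%s\n", $1, $2, $3, label
        }
    ' "$2" "$1"
}
```

Something like annotate_evec test.pca.evec phenotypes.tsv > test.pca.annotated.tsv would then give a table that can be read into R with sep = "\t" and plotted as before, with the fourth column mapped to color on a discrete instead of a continuous scale.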