Some useful R codes
This is an Attic to store some useful R codes, I don’t want to search them every time, so I put these codes at here.
Check packages if they are installed
check.package <- function(pkg){
new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
if (length(new.pkg))
install.packages(new.pkg, dependencies = TRUE)
sapply(pkg, require, character.only = TRUE)
}
packages.name <- c("reshape","ggplot2","gridExtra")
check.package (packages.name)
Draw beautiful arrowhead
require(ggplot2)
require(grid)
d = seals[sample(1:nrow(seals), 100),]
ggplot(d, aes(x = long, y = lat)) +
geom_segment(aes(xend = long + delta_long/100,
yend = lat + delta_lat/100),
arrow = arrow(angle = 10,type="closed"),
colour=c("black"),
arrow.fill=c("red"),
size=0.5)
TBC
How to convert webm to gif on Linux
webm is an audiovisual media file format. It is primarily intended to offer a royalty-free alternative to use in the HTML5 video and the HTML5 audio elements. It has a sister project WebP for images. On Fedora system, you could use the Gnome’s embedded screencast tool to create a 30 seconds video of your screen by default. What I want is a gif animation that would be easier to be transferred and shown online.
So how to convert webm file to gif file on Linux?
The best way is using ffmpeg 1.
ffmpeg -y -i input.webm -vf palettegen palette.png
ffmpeg -y -i input.webm -i palette.png -filter_complex paletteuse -r 10 output.gif
After that I also recommend using GNU Image Manipulation Program (GIMP) to crop off the unwanted part in the animation, it is also a way to reduce the animation size.
working with VCF files
- 1.sort VCF files
- 2.index VCF files
- 3.extract vcf from a bed region
- 4.intersect VCF files
- 5.merge VCF files
- 6.Concatenate or combine or append VCF files
- 7.Change Chromosome Notation
This is a note of working with VCF files. Try to avoid use awk/sed and other linux default command. Use the professional tools!
1.sort VCF files
java -jar picard.jar SortVcf INPUT=in.vcf OUTPUT=out.vcf
java -jar picard.jar SortVcf INPUT=in.vcf.gz OUTPUT=out.vcf.gz
It will provide index file if you set the output as gz file.
2.index VCF files
bgzip genotypes.vcf && tabix -p vcf genotypes.vcf.gz
Ref:Question: Generate vcf.gz file and its index file vcf.gz.tbi
3.extract vcf from a bed region
Need have a vcf index file before extract from the vcf file.
Print header line with -h.
tabix -R region.bed myvcf.gz -h > extract.vcf
Ref:Question: Extract Sub-Set Of Regions From Vcf File
4.intersect VCF files
bcftools isec -p output_dir A.vcf.gz B.vcf.gz
output_dir/0000.vcf.gzwould be variants unique to A.vcf.gzoutput_dir/0001.vcf.gzwould be variants unique to B.vcf.gzoutput_dir/0002.vcf.gzwould be variants shared by A.vcf.gz and B.vcf.gz as represented in A.vcf.gzoutput_dir/0002.vcf.gzwould be variants shared by A.vcf.gz and B.vcf.gz as represented in B.vcf.gz
Ref: Question: intersect VCF files
5.merge VCF files
When we say merge VCF files, it means that all genotype of one snv/indel will be merged in single line.
A.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3
10 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
11 . C A . . DP=3;CALLER=Samtools GT . . 1/1
12 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
13 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1
B.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S4 S5 S6
10 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
11 . C A . . DP=3;CALLER=Samtools GT 0/1 . 1/1
12 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 1/1
13 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
merged.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3 S4 S5 S6
10 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1 0/1 0/0 0/1
11 . C A . . DP=3;CALLER=Samtools GT . . 1/1 0/1 . 1/1
12 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0 0/0 0/0 1/1
13 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1 0/0 0/0 0/0
bcftools only accept zipped vcf files. -Oz is the output format of .gz
bcftools merge *vcf.gz -Oz -o Merged.vcf.gz
Ref:Question: Best way to merge multiple VCF files
6.Concatenate or combine or append VCF files
All source files must have the same sample columns appearing in the same order. The program can be used, for example, to concatenate chromosome VCFs into one VCF, or combine a SNP VCF and an indel VCF into one. The input files must be sorted by chr and position.
A.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3
10 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
11 . C A . . DP=3;CALLER=Samtools GT . . 1/1
12 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
13 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1
B.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3
14 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
15 . C A . . DP=3;CALLER=Samtools GT . . 1/1
16 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
17 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1
concat.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3
10 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
11 . C A . . DP=3;CALLER=Samtools GT . . 1/1
12 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
13 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1
14 . C A . . DP=3;CALLER=Samtools GT 0/1 0/0 0/1
15 . C A . . DP=3;CALLER=Samtools GT . . 1/1
16 . C A . . DP=3;CALLER=Samtools GT 0/0 0/0 0/0
17 . C A . . DP=3;CALLER=Samtools GT 0/1 1/1 1/1
bcftools concat A.vcf.gz B.vcf.gz -Oz -o output.vcf.gz
Ref:Question: How to merge vcf files with different variants but same samples?
7.Change Chromosome Notation
Please remember that in the VCF header line there also have chromosome notaion, if you use awk/sed to change the notaion, you should also change it on the VCF header line.
echo "1 chr1" >> chr_name_conv.txt
echo "2 chr2" >> chr_name_conv.txt
bcftools annotate --rename-chrs chr_name_conv.txt original.vcf.gz | bgzip > rename.vcf.gz
Rising of Populism
Trump started to use the term “Chinese Virus” in twitter and briefings in the White House since 17 March. It really let me feel angry, he almost did nothing to help the Americans stop the spread of coronavirus except blame China. However, he stands for Americans, it implies most of Americans may agree with his view – The coronavirus is a “Chinese Virus”. Trump’s actions are all driven by domestic politics in preparation for the 2020 election. COVID-19 is a global public health event. The United States is facing a huge pandemic prevention challenge, and the pandemic is likely to develop into a public crisis. Trump has a very clear mind on this, in order to ensure that his re-election is not affected by the COVID-19 pandemic, he needs to shape a “story” for COVID-19. “Foreign virus” is such a “story”. Its theme is: the United States is being invaded by foreign virus, the virus is imported from “other countries”, threatening the health and safety of the American public. Americans are innocent victims. Trump’s role as president is to protect the United States.
Trump wants to “up the ante”, to make the stroy more vivid by throw out the “Chinese virus”. It clearly links the virus to China. And with China-threat theory, U.S.-China confrontation being the topic of U.S. politics for the last two years, cracking down and containing China is now politically correct in Washington. China was supposed to be the central topic of the Trump 2020 election.
The use of term “Chinese virus” is an attempt to link infectious diseases to anti-Chinese sentiment and anti-Chinese politics.

Trump’s story is that the COVID-19 is coming precisely from the biggest threat - China. Even though China did not “intentionally export” the virus, its “failure to disclose” and “poor management” led to its spread to the U.S, causing serious damage to the U.S.
In this way, people unconsciously shift their attention from their own epidemic prevention and government responsibility to China. They do not blame the government for its poor epidemic prevention, but to think that China exports viruses to the world and hurts the world. American politicians can find a scapegoat for their own failings.
I think all of the Trump supporters will agree on his view. I know from my former hospital that an American physician never goes abroad, this person believes U.S. is the best country and never visits any other place in the world (because other countries are not better than the U.S.). It is real ignorance.
The appeal of the populists has grown with mounting public discontent over the status quo. In the West, many people feel left behind by technological change, the global economy, and growing inequality. Horrific incidents of terrorism generate apprehension and fear. Some are uneasy with societies that have become more ethnically, religiously and racially diverse. There is an increasing sense that governments and the elite ignore public concerns. The Dangerous Rise of Populism
Nativism, xenophobia, racism are on the rise. We couldn’t avoid facing more and more conflict between China and the West. Nobody will win in the last.
In China, there are also lots of people support for Populism, some of them believe the COVID-19 is de novo synthetic viruses, some of them believe traditional medicine is better than modern medical medicine. They believe the person who says the truth but not consistent with the government voice is the spokesperson for the West. Personal attacks via social media are more and more common.
The ideology gap between the West and China will become larger than we could think.
The economic gap between the developed countries and developing countries will become larger than we could predict. By the way, do you see many artists from South East Asia or Latin American or Africa at the virtual concert series – One World: Together at Home? Do you think those people have their voice in the world?
How to set up Samba on CentOS
Samba is the standard Windows interoperability suite of programs for Linux and Unix.
People can browse a server directory like use Windows OS on their own computer.
It also could avoid the scp step if you want to download and open a text file in the server, yes, you could open any files on the server and edit it like on your own computer.
Here I will introduce the steps to set up Samba on CentOS in a home network. There may be a problem if someone wants to use this Samba function on Windows 10 systems with a server that support different version of the Samba protocol. My laptop OS is Fedora, and the server OS is CentOS.
- Install Samba on local computer and server, eg:
yum install samba samba-client samba-client-libs. - Start the protocol on server,
sudo systemctl start smb.service.- The
smbdservice provides file sharing and printing services and listens on TCP ports 139 and 445. At here, for personal reason, I didn’t want to start thenmbdserver, which provides NetBIOS over IP naming services to clients and listens on UDP port 137.
- The
- Set up firewall
firewall-cmd --list-services firewall-cmd --list-ports firewall-cmd --permanent --zone=public --add-service=samba - I want to use the
/home/yulijiadirectory via Samba, so the Samba configuration file (/etc/samba/smb.conf) could be:[homes] comment = Home Directories valid users = %S, %D%w%S browseable = Yes read only = No create mask = 0700 directory mask = 0700 inherit acls = YesOnce done, please restart Samba services,
systemctl restart smb.service - Connect from your local machine.
Because I use Fedora, so I could open the Nautilus to add a new
connect to serverlink (smb://192.168.0.111:139/), then connect to the server, select Registered User, enter your server username and password.


Then the files on the Samba server will be shown.
Note: the default port of Samba service is 139.
Reference: