The two rsIDs each identify one variant site

2019-12-05 04:19 Category: Common diseases

How to obtain SNP information

Source: http://www.bio-info-trainee.com/2100.html#more-2100

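Studies have shown that rs7574865 in STAT4 and rs9275319 in HLA-DQ are genetic susceptibility loci for hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) in the population studied.

In other words, variation at these two sites is an important contributor to HBV-related HCC. Each rsID denotes one variant site (an identifier attached when a discovered variant is annotated with a tool such as VEP or SnpEff). An rsID can therefore be used to find the site's position in the genome, and dbSNP can be used to look up the genomic coordinates of an rsID.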

Method 1:
Download the All_20160601.vcf.gz file (a very large dataset):

mkdir -p ~/annotation/variation/human/dbSNP
cd ~/annotation/variation/human/dbSNP
## https://www.ncbi.nlm.nih.gov/projects/SNP/
## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh38p2/
## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/
nohup wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz &
wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz.tbi

Running this produced an error: No such directory ‘snp/organisms/human_9606_b147_GRCh37p13/VCF’.
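If a dbSNP VCF can be obtained locally (from this FTP path or elsewhere), looking up the coordinates of a few rsIDs amounts to scanning the ID column. The following is a minimal Python sketch, not the original author's method; the function names and the local file path are hypothetical.

```python
import gzip


def find_rsids(lines, wanted):
    """Return {rsID: (chrom, pos)} for the requested IDs found in VCF lines."""
    wanted = set(wanted)
    found = {}
    for line in lines:
        if line.startswith("#"):        # skip VCF header lines
            continue
        chrom, pos, ids = line.split("\t", 3)[:3]
        for rsid in ids.split(";"):     # the ID column may hold several IDs
            if rsid in wanted:
                found[rsid] = (chrom, int(pos))
        if len(found) == len(wanted):   # stop early once everything is found
            break
    return found


def find_rsids_in_file(path, wanted):
    # Hypothetical helper: `path` is wherever the gzipped VCF was saved.
    with gzip.open(path, "rt") as handle:
        return find_rsids(handle, wanted)
```

For example, `find_rsids_in_file("All_20160601.vcf.gz", ["rs7574865", "rs9275319"])` would scan the file line by line. For a file this large, an indexed query through the accompanying .tbi file would be much faster than a linear scan.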

Method 2:
Alternatively, use the web version of the database and edit the URL directly (suitable for a small number of lookups):
https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=7574865
https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs9275319

Method 3:
SNPedia: edit the URL directly (advantage: it collects links to many other databases)
https://www.snpedia.com/index.php/Rs7574865
https://www.snpedia.com/index.php/Rs9275319


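  1. With window=100k and step=2k, compute the SNP density in each window, then use normalmixEM from mixtools (a two-component normal mixture model, which is what normalmixEM fits when no k is given) to characterize the distribution of SNP density.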
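The windowing step can be sketched as follows. This is an illustrative Python version under stated assumptions: SNP positions for one chromosome are given as integers, windows are half-open intervals, and windows that run past the chromosome end are simply truncated. The function name and signature are hypothetical.

```python
from bisect import bisect_left


def window_snp_counts(positions, chrom_length, window=100_000, step=2_000):
    """Count SNPs in each half-open window [start, start + window)."""
    positions = sorted(positions)
    counts = []
    start = 0
    while start < chrom_length:
        end = start + window
        # Number of positions p with start <= p < end.
        n = bisect_left(positions, end) - bisect_left(positions, start)
        counts.append((start, n))
        start += step
    return counts
```

Dividing each count by the window size gives a density per base. The resulting vector of densities is the kind of input a mixture model fit would take.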

Extension: how to run a GWAS analysis

Method 1:
Analysis with plink
The plink website: https://www.cog-genomics.org/plink2/
SNP filtering and GWAS with plink
GWAS analysis with plink

Method 2:
Analysis with R packages (drawing Manhattan plots)
Postgwas: Advanced GWAS Interpretation in R
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0071775


R command:

How to call SNPs and indels

Reference: http://blog.sina.com.cn/s/blog_83f77c940102w2eb.html


library(mixtools)

How to filter SNPs

Source: http://blog.sina.com.cn/s/blog_83f77c940102w2eg.html

  1. Missing rates
    GENO>0.05

Shortly we will apply more stringent criteria, such that GENO > 0.05. In this case, 0.05*89 = 4.45 samples, meaning that if a SNP is missing in 4.45 or more samples, that SNP will be removed from the dataset.

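Here 89 is the total number of samples, and 89 × GENO gives a threshold of 4.45. A called SNP that is missing in 4 samples or fewer is kept; one that is missing in 5 samples or more is removed.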
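The arithmetic above can be checked with a short Python sketch. The function name is hypothetical, and it only mirrors the rule described in the text (drop a SNP whose missing fraction exceeds the threshold).

```python
def passes_missing_rate(n_missing, n_samples, geno=0.05):
    """Keep a SNP only if its missing fraction does not exceed `geno`."""
    return n_missing / n_samples <= geno


# With 89 samples the cut-off falls between 4 and 5 missing calls.
kept = passes_missing_rate(4, 89)      # 4/89 is about 0.045
removed = passes_missing_rate(5, 89)   # 5/89 is about 0.056
```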

  1. Minor allele frequencies
    Tip: MAF < 0.03; if there are many SNPs, MAF < 0.05 can be used instead.

MAF is the Minor Allele Frequency. It can be used to exclude SNPs which are not informative because they show little variation in the sample set being analyzed. For instance, if a SNP shows variation in only 1 of the 89 individuals, it is not useful statistically and should be removed.

That is, a SNP that appears in only a very small number of samples (fewer than MAF × total number of samples) should be removed.
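As a rough illustration, MAF can be computed from genotype counts. Strictly, the frequency is taken over alleles (two per diploid sample) rather than over samples, so the per-sample rule of thumb above is an approximation. The function name below is hypothetical.

```python
def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Minor allele frequency from diploid genotype counts."""
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    alt_freq = (n_het + 2 * n_hom_alt) / total_alleles
    return min(alt_freq, 1 - alt_freq)


# The example from the text: variation in only 1 of 89 individuals,
# assumed here to be a single heterozygote.
maf = minor_allele_frequency(88, 1, 0)   # 1 / 178
fails_filter = maf < 0.03
```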

  1. Removing SNPs out of Hardy-Weinberg equilibrium (keep p-value > 10^-6 to 10^-4)

Population genetic theory suggests that under ‘normal’ conditions, there is a predictable relationship between allele frequencies and genotype frequencies. In cases where the genotype distribution is different from what one would expect based on the allele frequencies, one potential explanation for this is genotyping error. Natural selection is another explanation. For this reason, we typically check for deviation from Hardy-Weinberg equilibrium in the controls for a case-control study. For a quantitative trait, PLINK just uses everyone. The following command generates p-values for deviation from HWE for each SNP. Low p-values indicate that a SNP is out of HWE.
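To make the idea concrete, here is a minimal Python sketch of a chi-square goodness-of-fit test for HWE with one degree of freedom. This is a simplified stand-in: PLINK itself reports an exact test, so its p-values will differ, especially for rare alleles. The function name is hypothetical.

```python
import math


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))
```

Genotype counts that match expectation exactly (for example 25/50/25) give a p-value of 1, while a sample with no heterozygotes at all (50/0/50) gives a p-value far below the 0.0001 threshold used with --hwe in the command further down.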

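  1. Filtering SNPs starting from a VCF file
    Use vcftools to convert the VCF to PLINK input format, which writes a ped file and a map file, then use those as input for filtering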
vcftools --vcf my.vcf --plink --out plink

plink --noweb --file plink --geno 0.05 --maf 0.05 --hwe 0.0001 --make-bed --out QC

SNPdensity=read.table("snp.density.file")

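If you are not yet sure what GWAS or a SNP is, here are the definitions:

Source: http://www.biotrainee.com:8080/thread-1487-1-1.html
Genome-wide association studies (GWAS) use the sequence variation present across the whole human genome, namely single nucleotide polymorphisms (SNPs), and screen it for SNPs associated with disease.

  • How are diseases related to SNPs?
    In recent years, genome-wide association studies (GWAS) using large cohorts and high-density SNP (single nucleotide polymorphism) markers have mapped thousands of SNP loci associated with complex diseases, and these association signals have been highly reproducible across repeated studies. Examples include common human diseases such as obesity, diabetes, and schizophrenia.
  • What are the sources of error for SNPs?
    Because random sampling introduces measurement error (unavoidable in practice) and because of the complex linkage disequilibrium (LD) among SNPs, the SNP loci that GWAS maps are usually not the causal loci.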

An article published in PLOS ONE in 2016 discusses SNPs and osteoarthritis.
It is not a top-tier journal, but the article is of good quality.

mixmdl=normalmixEM(SNPdensity[[1]])  # assumes the density values are in the first column of the table

Functional Characterization of the Osteoarthritis Susceptibility Mapping to CHST11—A Bioinformatics and Molecular Study

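As the title indicates, this is a study of osteoarthritis, and the target gene is CHST11. Carbohydrate sulfotransferase 11 is an enzyme that in humans is encoded by the CHST11 gene (the author was unsure how to translate the enzyme name into Chinese and invited corrections from biochemists). The gene is located at chr12:104,455,295-104,762,014 (GRCh38). For functional studies of CHST11, the Sanger Institute in the UK has generated a knockout mouse for this gene, Chst11^tm1a(KOMP)Wtsi. The gene is mainly associated with bone and cartilage phenotypes. Phenotyping of the mouse found an abnormality: Homozygous viability at P14.

A 2012 article in The Lancet also reported that variation in this gene is linked to osteoarthritis; that journal's standing needs no introduction.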

plot(mixmdl,which=2)

Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study

Next, let us look at these two articles in turn, along with the gene, its SNPs, and the research on and interpretation of their function.

mixmdl$mu; mixmdl$sigma; mixmdl$lambda

(I) Background on osteoarthritis:

 

What is OA?

(1) Osteoarthritis (OA) is a common disease of older individuals that is characterized by the focal (localized) loss of articular cartilage. This loss usually occurs gradually over many years and typically results in chronic pain and severely impaired joint function by the sixth or seventh decade of life.

(2)Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people.

What are the genetic characteristics of OA?

(1) OA is polygenic and, unlike many other common arthritic diseases, there are no OA risk-conferring loci of large singular impact.
(2)It is a complex disease of the musculoskeletal system with both genetic and environmental risk factors. From the results of heritability studies in twins, sibling pairs, and families, genetic factors are estimated to account for about 50% of the risk of developing osteoarthritis in the hip or knee, although precise estimates vary according to sex, affected site, and severity of disease.

(II) Research methods:

(1) Focus on functional analysis

  • Identification of SNPs in LD with rs835487
  • Identification of Sequences Homologous to the Enhancer in Non-Human Mammals
  • Cloning of pGL3-Promoter Luciferase Reporter Plasmids
  • Transfection of Cell Lines
  • Electrophoretic Mobility Shift Assays (EMSAs)
  • Ethics Statement, Cartilage Collection and Nucleic Acid Extraction
  • Gene Expression, Genotyping and AEI Analysis
  • Chondrogenic Differentiation of MSCs

(2) Focus on analysis

  • We undertook a large genome-wide association study (GWAS) in 7,410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11,009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7,473 cases and 42,938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent.

(III) Conclusions

(1)rs835487 (allele G; THR) located within intron two of CHST11 is associated with hip OA
