Wikidata:ScienceSource project/NCBI2wikidata notebook

This page for the ScienceSource project tracks the outputs of the NCBI2wikidata tool.

Detailed analytics

For a daily breakdown of metadata added, see the table now at Wikidata:ScienceSource project/NCBI2wikidata dashboard.

Trials

health specialty (P1995)/JSON filename | MeSH term listing (concise, fuller list in a hidden comment) | Edit groups (QS additions, includes batch number) | Reviews, before (any taggings as systematic review (Q1504425) noted in parentheses) | Reviews, after | Review finder query | License notes | Focus list ingestion
otolaryngology (Q189553)
otolaryngologyfeed
"Auditory Diseases, Central","Benign Paroxysmal Positional Vertigo","Cholesteatoma","Cholesteatoma, Middle Ear","Cochlear Diseases","Ear Diseases",... [2]
[3]
0 15 Displayed in notes.[1] CC-BY anomalies[2] Items to focus list
Problematic date duplication: Auditory distance perception in humans: a review of cues, development, neuronal bases, and effects of sensory loss. (Q30390906), Specific synaptopathies diversify brain responses and hearing disorders: you lose the gain from early life (Q30407365), microRNAs: the art of silencing in the ear (Q30461076), Neuroauditory toxicity of artemisinin combination therapies-have safety concerns been addressed? (Q33841142), Vestibular rehabilitation in benign paroxysmal positional vertigo: Reality or fiction? (Q39293391)
feedA.json [4], stopped at 29% (Q26796661)
Restart as [5], 1550216608634, stopped at 20.5% (Q59075268)
Restart as 1550220819175, stopped at 33.8% (Q19125117)
Restart as 1550223271635, from Q37042563, stopped at 52.1% (Q36989584)
Restart as 1550224867252, completed.
(LSHTM workshop demo) Schistosomiasis [6]
feedpulmonology (less) Pulmonology [7] stopped at Hypersensitivity pneumonitis (Q59212250) (163 edits, connection broke)
[8], done
[9] focus list import[3]
Among focus list papers, with pulmonology main subjects, removed licences Creative Commons Attribution-NoDerivs 4.0 International (Q36795408), Creative Commons Attribution-NonCommercial-NoDerivatives (Q6937225), Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (Q19125045), Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (Q24082749) in a query[4]

Ingest query[5]
producing feed file pulmonologyless.json
disease query[6] producing dictionary pulmonologydiseases.json
drug query[7] producing dictionary pulmonologydiseasesdrugs.json


97 HTML papers posted. Run terminated because of an Internet connection failure. Possible cleanup required, to 1 March 2019 1952.
gtfeedophthalmologyless [10] stopped at 26.7%
[11] stopped at 98%
[12] completed
[13] focus list import[8]
Ingest query and dictionary creation on the pattern of pulmonologyless
gtfeedhematologyless [14] completed
gtfeedurologyless [15] stopped at 27%
[16] terminated in error
[17] stopped at 35%
[18] completed
gtfeedpsychiatryless [19] halted
[20] completed
gfdermatology Omitting "Skin Diseases" [21] halted at 30.6%
[22] completed
[23] stopped at 34.1%
[24] stopped at 7.7%
[25] completed
[26] completed
[27] completed
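
The licence filter applied to the pulmonology focus list (note 4) follows a pattern that could be generated for any specialty. The following is a minimal sketch, not the tool's own code: the function name `focus_list_license_query` and the constant names are hypothetical, and the QIDs are copied from the queries in the notes on this page.

```python
from typing import Iterable

# Focus list item used with P5008 in the queries in the notes.
FOCUS_LIST = "Q55439927"

# The four licence items removed in note 4.
EXCLUDED_LICENSES = ["Q36795408", "Q6937225", "Q19125045", "Q24082749"]


def focus_list_license_query(specialty_qid: str,
                             excluded: Iterable[str]) -> str:
    """Build a SPARQL query for focus-list items in one health specialty,
    leaving out items that carry any of the excluded licences.
    (Hypothetical helper, for illustration only.)"""
    minus = "\n".join(
        f"  MINUS {{ ?item wdt:P275 wd:{qid} }}" for qid in excluded
    )
    return (
        "SELECT DISTINCT ?item ?licenseLabel\n"
        "WHERE {\n"
        f"  ?item wdt:P5008 wd:{FOCUS_LIST};\n"
        "        wdt:P275 ?license;\n"
        "        wdt:P921 ?subject.\n"
        f"  ?subject wdt:P1995 wd:{specialty_qid}.\n"
        f"{minus}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Q203337 is the specialty item used in the pulmonology queries in the notes.
query = focus_list_license_query("Q203337", EXCLUDED_LICENSES)
print(query)
```

The printed text can be pasted into the Wikidata Query Service. Swapping the specialty QID gives the equivalent query for another trial feed.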

Production runs

Neurology

QS code text file(s) | Edit groups (QS additions, default 6K lines of code)
resultsneurology1 [28] stopped, [29] stopped, [30] stopped, [31] completed, to EOF.
resultsneurology2 [32] stopped, [33] stopped, [34] completed, to EOF.
resultsneurology3 [35] stopped, [36] completed
[37] completed
[38] completed
[39] stopped, [40] completed
[41] completed
[42] completed, EOF.
resultsneurology4 [43] completed
[44] completed, EOF.
resultsneurology5 Blank running, [45] stopped, [46] completed
[47] completed, EOF.
resultsneurology6 [48] completed
[49] completed
[50] completed, EOF.
resultsneurology7 [51] stopped, [52] completed
[53] completed, EOF.
resultsneurology8 [54] completed
[55] completed, EOF.
resultsneurology9 [56] completed
[57] stopped, [58] completed
[59] completed
[60] completed, EOF
resultsneurology10 [61] completed
[62] stopped, [63] stopped, [64] completed
[65] stopped, [66] stopped, [67] stopped, [68] stopped, [69] completed
[70] stopped, [71] completed
[72] completed
[73] stopped, [74] completed
[75] stopped, [76] completed
[77] stopped, [78] completed
[79] completed
[80] completed, EOF.
resultsneurology11 [81] completed
Blank running, [82] completed
[83] stopped, [84] to EOF.
resultsneurology12 [85] stopped, blank running to EOF.
resultsneurology13 [86] completed
[87] completed
[88] stopped, blank running to EOF.
resultsneurology14 Blank running to EOF.

Medical genetics

QS code text file(s) | Edit groups (QS additions, default 6K lines of code)
resultsmedicalgenetics1
resultsmedicalgenetics2
[89] completed
resultsmedicalgenetics3
resultsmedicalgenetics4
[90] completed
resultsmedicalgenetics5 [91] completed
[92] stopped, [93] stopped, [94] completed
[95] completed, to EOF.
resultsmedicalgenetics6 [96] completed, to EOF.
resultsmedicalgenetics7 [97] stopped, [98] completed.
[99] completed, to EOF.
resultsmedicalgenetics8 [100] completed, to EOF.
resultsmedicalgenetics9 [101] stopped, [102] stopped, [103] completed
[104] completed.

Oncology

health specialty (P1995) oncology (Q162555). Batches 6K lines of output.
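
Batching a long QuickStatements code file into runs of about 6K lines can be sketched as below. This is only an illustration under assumptions: `split_batches` is a hypothetical helper, the placeholder strings stand in for real QS commands, and the page does not say how the batches were actually cut.

```python
from typing import Iterator, List, Sequence


def split_batches(lines: Sequence[str], size: int = 6000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` lines.
    The last batch holds whatever remains before end of file."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(lines), size):
        yield list(lines[start:start + size])


# Placeholder lines only; a real file would hold QuickStatements commands.
placeholder = [f"command {n}" for n in range(13000)]
sizes = [len(batch) for batch in split_batches(placeholder)]
print(sizes)  # [6000, 6000, 1000]
```

Passing a different `size` covers the smaller and larger batches recorded in the table, such as the runs marked 4K or 7K.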

QS code text file | MeSH term listing notes | Edit groups (QS additions, default 6K lines of code)
resultsoncology1 [105], stopped, [106] completed
[107] stopped, [108] completed
[109] completed
[110] stopped, [111] completed
[112] completed
[113] completed
[114] completed
[115] completed, at EOF.
resultsoncology2 [116] completed
[117] completed
[118] completed
[119] completed
[120] stopped, [121] completed
[122] completed
[123] completed
[124] completed
[125] stopped, [126] completed
[127] stopped, [128] completed
Blank run; [129] completed
[130] completed
[131] completed
[132] completed
[133] completed
[134] completed
[135] completed
[136] completed, at EOF.
resultsoncology3 [137] completed
[138] completed
[139] completed
[140] completed, EOF.
resultsoncology4 [141] completed
[142] completed
[143] completed
[144] completed
[145] stopped, [146] completed
[147] completed
[148] completed
[149] stopped, [150] completed
[151] stopped, [152] completed
[153] stopped, [154] completed
[155] completed
[156] (4K) completed
[157] (7K) completed
[158] (7K) completed
[159] (7.5K) stopped, [160] stopped, [161] completed to 6.5K
[162] completed
[163] (7.5K) stopped, [164] stopped, [165] stopped, [166] completed
[167] (6.5K) stopped, [168] stopped, [169] stopped, [170] completed
[171] stopped, [172] stopped, [173] completed
Blank runs, [174], blank running
[175] completed
Blank running, [176] completed
[177] completed, to EOF.
resultsoncology5 [178], running blank to 18K.

Other

health specialty (P1995)/JSON filename | MeSH term listing notes | Edit groups (QS additions, default 5K lines of code)
gfinfectious1.json All to before "Virus Diseases" [179]
[180] restarted as [181], completed
[182] completed
[183] completed
[184] completed
[185] completed
[186] completed
[187] completed
[188] completed
[189] completed
[190] completed
[191] completed
[192] completed
[193] stopped at 57%, [194] completed
[195] stopped, [196] completed
[197] completed
[198] completed
[199] completed
[200] completed
[201] completed
[202] completed
[203] completed
[204] completed
[205] completed
Blank running, then [206] to EOF
[207] stray run
gfinfectious2.json From "Virus Diseases" [208] completed
[209] completed
[210] completed
[211] completed
[212] completed
[213] completed, blanks to EOF.
gfendocrinology [214] completed
[215] completed
[216] completed
[217] completed
[218] completed
[219] completed
[220] completed
[221] completed, almost blank
[222] completed
[223] completed
[224] completed
[225] completed
[226] completed
[227] completed
[228] completed
[229] completed
[230] completed
[231] completed
[232] completed
[233] completed
[234] completed
[235] stopped, blank run, [236] stopped at 44.9%, [237] completed
[238] stopped, [239] completed
[240] stopped, [241] completed
[242] stopped, [243] completed
[244] stopped, [245] stopped, residue to next batch
[246], restarted as [247], stopped, [248] stopped, [249] stopped, blanks
[250] stopped, [251] to EOF.
gfcardiology1 Omitting "Heart Diseases", "Cardiovascular diseases" [252] completed
[253] completed
[254] completed
[255] completed
[256] completed
[257] completed
[258] completed
[259], restarted as [260], completed
[261] completed
[262] completed
[263] completed, EOF
gfcardiology2 Omitting "Heart Diseases", "Cardiovascular diseases" [264] completed
[265] completed
[266] completed
[267] completed, EOF
gfgastroenterology Omitting "Intestinal Diseases", "Liver Diseases" [268] completed
[269] completed
[270] completed
[271] completed
[272] completed
[273] completed
[274] completed
[275] completed
[276] completed
[277] completed
[278] completed
[279] completed
[280] completed
[281] completed
[282] completed
[283] completed
[284] completed
[285] completed
[286] completed to EOF
gfrheumatology Omitting "Arthritis", "Connective Tissue Diseases" [287] completed
[288] completed
[289] completed
[290] completed
[291] completed
[292] completed
[293] completed
[294] completed
[295] completed
[296] completed
[297] completed, EOF
gfmisc1a/b Omitting "Vascular Diseases" [298] completed
[299] completed
[300] completed, [301] completed (split batch)
[302] completed
[303] completed
[304] completed
[305] completed
[306] completed
[307] completed
[308] completed
[309] completed
[310] completed
[311] completed
[312] completed
[313] completed
[314] completed
[315] completed
[316] completed
[317] completed, done misc1a
[318] completed, done misc1b
gfmisc2 [319] completed
[320] completed
[321] completed
[322] completed
[323] completed
[324] completed
[325] completed
[326] completed
[327], [328] completed (split batch)
[329] completed
[330] completed
[331] (new run) completed
[332] completed
[333] completed
[334] completed
[335] completed
[336] completed
[337] completed
[338] stopped at 99.6%
[339] completed
[340] completed, EOF

For more NCBI runs, from June 2019, see Wikidata:ScienceSource project/NCBI2wikidata additional.

Notes

  1. # Otolaryngology reviews
    SELECT DISTINCT ?item ?itemLabel
    WHERE {
      ?item wdt:P31 wd:Q7318358;       # instance of: review article
            wdt:P921 ?subject.         # main subject
      ?subject wdt:P1995 wd:Q189553.   # health specialty: otolaryngology
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
  2. Two of the reviews found, Animal Models of Middle Ear Cholesteatoma (Q30475583) and Specific synaptopathies diversify brain responses and hearing disorders: you lose the gain from early life (Q30407365), were not tagged with a license by NCBI2wikidata, because the Europe PMC XML gives the license only as CC-BY, with no version number. The CC-BY statement has been added by hand. In one case, Neuroauditory toxicity of artemisinin combination therapies-have safety concerns been addressed? (Q33841142), CC-BY was added although the XML has CC-BY 4.0.[1] Corrected by hand.
  3. Based on
    # Review articles with a main subject in the specialty Q203337
    SELECT DISTINCT ?item
    WHERE {
      ?item wdt:P31 wd:Q7318358;
            wdt:P921 ?subject.
      ?subject wdt:P1995 wd:Q203337.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
  4. # Focus list items in the specialty Q203337, minus the four excluded licenses
    SELECT DISTINCT ?item ?licenseLabel
    WHERE {
      ?item wdt:P5008 wd:Q55439927;
            wdt:P275 ?license;
            wdt:P921 ?subject.
      ?subject wdt:P1995 wd:Q203337.
      MINUS { ?item wdt:P275 wd:Q36795408 }
      MINUS { ?item wdt:P275 wd:Q6937225 }
      MINUS { ?item wdt:P275 wd:Q19125045 }
      MINUS { ?item wdt:P275 wd:Q24082749 }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
  5. # Ingest query: focus list review articles with the metadata needed for the feed file
    SELECT DISTINCT ?item ?itemLabel ?pmcid ?journalLabel ?title ?date ?licenseLabel ?mainsubjectLabel
    WHERE {
      ?item wdt:P31 wd:Q7318358;
            wdt:P5008 wd:Q55439927;
            wdt:P932 ?pmcid;
            wdt:P1433 ?journal;
            wdt:P1476 ?title;
            wdt:P577 ?date;
            wdt:P275 ?license;
            wdt:P921 ?mainsubject.
      ?mainsubject wdt:P1995 wd:Q203337.
      MINUS { ?item wdt:P275 wd:Q36795408 }
      MINUS { ?item wdt:P275 wd:Q6937225 }
      MINUS { ?item wdt:P275 wd:Q19125045 }
      MINUS { ?item wdt:P275 wd:Q24082749 }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }
    LIMIT 300
  6. # Disease query: items whose health specialty is Q203337 or part of it
    SELECT DISTINCT ?item
    WHERE {
      ?item wdt:P31 wd:Q12136 .
      ?item wdt:P1995 ?medspec .
      ?medspec wdt:P361* wd:Q203337 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
  7. # Drug query: items that treat (P2175) a condition in the specialty Q203337
    SELECT DISTINCT ?item
    WHERE {
      ?item wdt:P31 wd:Q12140 .
      ?item wdt:P2175 ?condition .
      ?condition wdt:P1995 wd:Q203337.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
  8. # Review articles with a main subject in the specialty Q161437
    SELECT DISTINCT ?item
    WHERE {
      ?item wdt:P31 wd:Q7318358;
            wdt:P921 ?subject.
      ?subject wdt:P1995 wd:Q161437.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
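
The disease and drug queries (notes 6 and 7) each return a list of items that is then kept as a dictionary file. A rough sketch of that step follows, assuming the query result was saved in the standard SPARQL JSON results format. The helper name `item_qids` is hypothetical, and the layout of the real dictionary files (for example pulmonologydiseases.json) is not documented on this page, so the flat list written here is a guess.

```python
import json

ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def item_qids(sparql_json: dict, var: str = "item") -> list:
    """Extract unique QIDs, in first-seen order, from a SPARQL JSON result.
    (Hypothetical helper; not part of NCBI2wikidata.)"""
    qids = []
    for row in sparql_json["results"]["bindings"]:
        uri = row.get(var, {}).get("value", "")
        if uri.startswith(ENTITY_PREFIX):
            qids.append(uri[len(ENTITY_PREFIX):])
    # dict.fromkeys drops duplicates while keeping order
    return list(dict.fromkeys(qids))


# Hand-written sample in the standard result shape; not real query output.
sample = {
    "head": {"vars": ["item"]},
    "results": {"bindings": [
        {"item": {"type": "uri", "value": ENTITY_PREFIX + "Q59212250"}},
        {"item": {"type": "uri", "value": ENTITY_PREFIX + "Q12136"}},
        {"item": {"type": "uri", "value": ENTITY_PREFIX + "Q59212250"}},
    ]},
}

qids = item_qids(sample)
dictionary_text = json.dumps(qids, indent=2)  # text that could be saved as a .json file
print(dictionary_text)
```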