
uploaded SVGs from 2015/2016 without XML prolog (<?xml ... ?>) served with Content-Type text/html from upload.wikimedia.org
Open, Low, Public

Description

Request context: Loading SVG as background image from this CSS file.

$ curl -I 'https://upload.wikimedia.org/wikipedia/commons/archive/c/c9/20160327111940!Ballot-sprite.svg'
HTTP/1.1 200 OK
Date: Sun, 27 Mar 2016 13:25:03 GMT
Content-Type: text/html
Connection: keep-alive
X-Object-Meta-Sha1base36: k1h4wypkf9tobg8qw2dymx0qab6s6on
Last-Modified: Sun, 27 Mar 2016 11:19:42 GMT
Etag: 8a196dc538b917f2153323c7e91bd8ca
X-Timestamp: 1459077581.66301
X-Trans-Id: txb8a16672748b4be7a5baf-0056f7df2e
X-Varnish: 3801160535, 3256140095, 2826967039
Via: 1.1 varnish, 1.1 varnish, 1.1 varnish
Age: 0
X-Cache: cp1062 miss(0), cp3037 miss(0), cp3035 frontend miss(0)
Strict-Transport-Security: max-age=31536000
Set-Cookie: WMF-Last-Access=27-Mar-2016;Path=/;HttpOnly;Expires=Thu, 28 Apr 2016 12:00:00 GMT
X-Analytics: https=1;nocookies=1
X-Client-IP: 91.218.200.230
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
Timing-Allow-Origin: *

Expected: Content-Type: image/svg+xml

Therefore, I can't use uploaded SVGs as background images for POTY. Firefox simply doesn't display it. It says an error occurred somewhere on some background property but the developer tools are currently broken and do not display the correct location (always show something in OOjs UI).
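The check above can be automated against a saved header dump. The following is only a minimal sketch: the function names are hypothetical, and it assumes the input is the raw text printed by `curl -I`, not a live request.

```python
from typing import Optional


def content_type_from_headers(raw_headers: str) -> Optional[str]:
    """Return the media type from a raw HTTP header dump, or None if absent."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # The status line ("HTTP/1.1 200 OK") has no colon and is skipped.
        if sep and name.strip().lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            return value.split(";", 1)[0].strip().lower()
    return None


def is_served_as_svg(raw_headers: str) -> bool:
    """True only if the response declares the expected SVG media type."""
    return content_type_from_headers(raw_headers) == "image/svg+xml"
```

Fed the response shown in the description, `is_served_as_svg` would report a mismatch, because the declared type there is text/html.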

Event Timeline

Restricted Application added a subscriber: Steinsplitter.

I can reproduce:

$ curl -I https://upload.wikimedia.org/wikipedia/commons/c/c9/Ballot-sprite.svg
HTTP/1.1 200 OK
Date: Sun, 27 Mar 2016 01:55:57 GMT
Content-Type: text/html
Connection: keep-alive
X-Object-Meta-Sha1base36: k1h4wypkf9tobg8qw2dymx0qab6s6on
Last-Modified: Sat, 26 Mar 2016 22:19:52 GMT
Etag: 8a196dc538b917f2153323c7e91bd8ca
X-Timestamp: 1459030791.20288
X-Trans-Id: txef394c2f19eb4001ac440-0056f70b27
X-Varnish: 3427899152, 1369810714 1336490883, 2690276095 2690257490
Via: 1.1 varnish, 1.1 varnish, 1.1 varnish
Age: 12933
X-Cache: cp1050 miss(0), cp3049 hit(3), cp3035 frontend hit(1)
Strict-Transport-Security: max-age=31536000
Set-Cookie: WMF-Last-Access=27-Mar-2016;Path=/;HttpOnly;Expires=Thu, 28 Apr 2016 00:00:00 GMT
X-Analytics: https=1;nocookies=1
X-Client-IP: 91.218.200.230
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
Timing-Allow-Origin: *

But not for a different file:

$ curl -I https://upload.wikimedia.org/wikipedia/commons/2/28/Information.svg
HTTP/1.1 200 OK
Date: Sun, 27 Mar 2016 01:55:34 GMT
Content-Type: image/svg+xml
Connection: keep-alive
X-Object-Meta-Sha1base36: 69nihwcga0f31wljnlxxv0zcpikni5k
Last-Modified: Mon, 14 Oct 2013 22:22:12 GMT
Etag: 34ce71403dcf6c7b3e9e55a75b5d8e7b
X-Timestamp: 1381789331.37070
X-Trans-Id: txe690913510da4538b6af4-0056f5bf56
X-Varnish: 2887756974 2883147041, 2371719403 2091834454, 2690222811 2690184231
Via: 1.1 varnish, 1.1 varnish, 1.1 varnish
Age: 97856
X-Cache: cp1048 hit(1), cp3048 hit(8), cp3035 frontend hit(1)
Strict-Transport-Security: max-age=31536000
Set-Cookie: WMF-Last-Access=27-Mar-2016;Path=/;HttpOnly;Expires=Thu, 28 Apr 2016 00:00:00 GMT
X-Analytics: https=1;nocookies=1
X-Client-IP: 91.218.200.230
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
Timing-Allow-Origin: *

Your SVG image doesn't have an XML prolog (something like <?xml version="1.0" encoding="UTF-8" standalone="no"?>). Such SVGs have confused various tools in the past (T61234). Perhaps something is mis-detecting the MIME type. Try adding the prolog and see if that fixes it.

Perhaps something is mis-detecting the MIME type. Try adding it and see if that fixes it.

Interesting. Thanks, this did it. Is this still considered an issue with the MIME type detection?

BTW, now one has to fetch an old version

$ curl -I https://upload.wikimedia.org/wikipedia/commons/archive/c/c9/20160327111940!Ballot-sprite.svg

for reproducing.

Interesting. Thanks, this did it. Is this still considered an issue with the MIME type detection?

I'd say it's still a bug. SVG files are allowed not to have the XML prolog (if they weren't, MediaWiki shouldn't allow you to upload one).
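To illustrate that the prolog is optional, here is a small sketch that identifies an SVG by its root element, not by its first bytes. It uses Python's standard XML parser; the function name is hypothetical and this is not how MediaWiki or Swift detect types.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def looks_like_svg(data: bytes) -> bool:
    """True if data is well-formed XML whose root element is <svg> in the SVG namespace."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return False
    # ElementTree spells namespaced tags as "{namespace}localname".
    return root.tag == "{%s}svg" % SVG_NS
```

A document with or without the <?xml ... ?> line gives the same answer here, since the parser accepts both forms.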

matmarex renamed this task from SVGs from upload.wikimedia.org served as content-type text/html to SVGs without XML prolog (<?xml ... ?>) served with Content-Type text/html from upload.wikimedia.org. · Mar 27 2016, 1:26 PM
matmarex updated the task description.

I'm not sure if this is an Apache configuration problem or what, I'm really using this to mean "The Servers" and hoping somebody can figure out what's producing the incorrect header. ;)

from upload? I don't think it can be apache

krenair@tin:~$ set +H
krenair@tin:~$ curl -I -H "Host: upload.wikimedia.org" "http://ms-fe.svc.eqiad.wmnet/wikipedia/commons/archive/c/c9/20160327111940!Ballot-sprite.svg"
HTTP/1.1 200 OK
Content-Length: 4159
X-Object-Meta-Sha1base36: k1h4wypkf9tobg8qw2dymx0qab6s6on
Accept-Ranges: bytes
Last-Modified: Sun, 27 Mar 2016 11:19:42 GMT
Etag: 8a196dc538b917f2153323c7e91bd8ca
X-Timestamp: 1459077581.66301
Content-Type: text/html
Access-Control-Allow-Origin: *
X-Trans-Id: tx11949a4f60ee45869ddbc-0056f7e726
Date: Sun, 27 Mar 2016 13:59:02 GMT

faidon subscribed.

Swift just serves whatever Content-Type MediaWiki set on the object when uploading it; it never performs any content sniffing. For the file above, this was indeed set to text/html:

root@ms-fe1001:~# swift stat wikipedia-commons-local-public.c9 'archive/c/c9/20160327111940!Ballot-sprite.svg'
       Account: AUTH_mw
     Container: wikipedia-commons-local-public.c9
        Object: archive/c/c9/20160327111940!Ballot-sprite.svg
  Content Type: text/html
Content Length: 4159
 Last Modified: Sun, 27 Mar 2016 11:19:42 GMT
          ETag: 8a196dc538b917f2153323c7e91bd8ca
Meta Sha1Base36: k1h4wypkf9tobg8qw2dymx0qab6s6on
 Accept-Ranges: bytes
   X-Timestamp: 1459077581.66301
    X-Trans-Id: txe252e85f575840e2868d2-0056f80bf0

Ugh… apparently SwiftFileBackend does its own MIME type detection from scratch, instead of using the type that MediaWiki already knows. (This is new as of de290cd02db7150549da5cc66c9af3de6933a68b.)

  • SwiftFileBackend::doCreateInternal() and SwiftFileBackend::doStoreInternal() both call FileBackendStore::getContentType()
  • FileBackendStore::getContentType() uses either:
    • $this->mimeCallback, if present, which AFAICS can only be set to FileBackendGroup::guessMimeInternal (in FileBackendGroup::get())
    • finfo_file() otherwise

As FileBackendGroup::guessMimeInternal() will normally just look at the file extension (if there is one), my guess is that something is causing mimeCallback to not be set, and the code to use the finfo_file() fallback. I'm not sure if it's a bug in de290cd02db7150549da5cc66c9af3de6933a68b or some more recent change – @faidon, is it possible to ask Swift about the oldest SVG files with text/html content-type that it has?
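The suspected control flow can be sketched as follows. This is a Python illustration of the logic described above, not the actual PHP code: all names are hypothetical, and `toy_sniff` is a deliberately crude stand-in for finfo_file() that only mimics the observed outcome (no prolog leads to text/html), not how libmagic really works.

```python
import os
from typing import Callable, Optional

MimeCallback = Callable[[str, bytes], Optional[str]]


def guess_by_extension(path: str, content: bytes) -> Optional[str]:
    """Stand-in for an extension-based guess; ignores the content entirely."""
    ext = os.path.splitext(path)[1].lower()
    return {".svg": "image/svg+xml", ".png": "image/png"}.get(ext)


def toy_sniff(content: bytes) -> str:
    """Toy content sniffer (assumption): decides from the leading bytes only."""
    head = content.lstrip()
    if head.startswith(b"<?xml"):
        return "image/svg+xml" if b"<svg" in head else "application/xml"
    if head.startswith(b"<"):
        return "text/html"
    return "application/octet-stream"


def get_content_type(path: str, content: bytes,
                     mime_callback: Optional[MimeCallback] = None) -> Optional[str]:
    """Use the callback if one is set, otherwise fall back to content sniffing."""
    if mime_callback is not None:
        return mime_callback(path, content)
    return toy_sniff(content)
```

With the callback set, a prolog-less SVG is typed by its extension; with the callback missing, the same bytes go through the sniffer and come back as text/html, which matches the symptom in this task.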

And the finfo_file() fallback is definitely subpar. The last issue described in T61234 was probably fixed when we switched to HHVM (or recently when we upgraded to 5.5), but here's what I get locally with file 5.04:

$ file Ballot-sprite.svg
Ballot-sprite.svg: SVG Scalable Vector Graphics image

$ file "20160327111940!Ballot-sprite.svg"
20160327111940!Ballot-sprite.svg: HTML document text

@faidon, is it possible to ask Swift about the oldest SVG files with text/html content-type that it has?

There is no such operation, but container listings do list the content-type of the objects underneath them, which would avoid a (very expensive) stat call against each object. You'd basically need to write something to list all container shards for all projects and then filter on name (ending in .svg) and content-type. It wouldn't take much time to write and run but it won't be instant.
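The filtering step could look roughly like this. It is a sketch only: it assumes the container listings have already been fetched in Swift's JSON listing format, where each record carries `name`, `content_type` and `last_modified` (an ISO 8601 string, so string order is chronological order). Fetching the listings for every shard is left out, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, str]


def mistyped_svgs(listing: Iterable[Record], bad_type: str = "text/html") -> List[Record]:
    """Keep records whose name ends in .svg but whose stored type is bad_type."""
    return [
        obj for obj in listing
        if obj["name"].lower().endswith(".svg") and obj["content_type"] == bad_type
    ]


def oldest(records: Iterable[Record]) -> Optional[Record]:
    """Return the record with the earliest last_modified, or None if there is none."""
    return min(records, key=lambda obj: obj["last_modified"], default=None)
```

Running `oldest(mistyped_svgs(listing))` over the combined listings of all shards would give a lower bound on when the wrong type started being stored.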

Something new:

https://upload.wikimedia.org/wikipedia/commons/c/c5/1929_wall_street_crash_graph.svg

Content-Type application/xml

Note the lack of +svg

There is no application/svg+xml MIME type, only image/svg+xml (or, of course, the generic application/xml and text/xml).

A wrong or too generic MIME type causes browsers not to render SVGs in <img> tags.

MarkTraceur triaged this task as Medium priority. · Dec 5 2016, 9:19 PM
MarkTraceur moved this task from Untriaged to Triaged on the Multimedia board.
matthiasmullie lowered the priority of this task from Medium to Low. · Jun 27 2017, 1:54 PM
matthiasmullie raised the priority of this task from Low to Medium.
matthiasmullie moved this task from Triaged to Needs code review on the Multimedia board.
matthiasmullie moved this task from Needs code review to Next up on the Multimedia board.
MarkTraceur lowered the priority of this task from Medium to Low. · Aug 28 2017, 4:48 PM
MarkTraceur moved this task from Next up to Triaged on the Multimedia board.
JoKalliauer renamed this task from SVGs without XML prolog (<?xml ... ?>) served with Content-Type text/html from upload.wikimedia.org to uploaded SVGs from 2015/2016 without XML prolog (<?xml ... ?>) served with Content-Type text/html from upload.wikimedia.org. · Aug 25 2024, 12:27 PM

I think we should fix all affected files by making any edit (e.g. adding <?xml version="1.0" encoding="UTF-8"?>), as I did on the last-mentioned file.
Since it does not affect newly uploaded files, we might consider closing it.
It only affects files from ~March 2016? (e.g. 7 March 2016 to 27 March 2016)
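For the local half of that fix, a sketch of adding the prolog to a file's bytes before re-uploading could look like this. It assumes the SVG is UTF-8 encoded; the function name is hypothetical and the re-upload itself is not shown.

```python
import re

XML_PROLOG = b'<?xml version="1.0" encoding="UTF-8"?>\n'
UTF8_BOM = b"\xef\xbb\xbf"


def ensure_xml_prolog(data: bytes) -> bytes:
    """Return data unchanged if it already has an XML declaration, else prepend one."""
    body = data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data
    # Require whitespace after "<?xml" so "<?xml-stylesheet ...?>" is not
    # mistaken for the declaration.
    if re.match(rb"\s*<\?xml\s", body):
        return data
    # The declaration must be the very first thing in the file, so any BOM
    # and leading whitespace are dropped when it is added.
    return XML_PROLOG + body.lstrip()
```

The function is idempotent, so running it over a file that was already fixed changes nothing.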
