Description
Basic description
A new extension for providing information about an IP address without (1) the need for the user to use an external service themselves and (2) exposing the IP address itself to the user. (Actually hiding the IP addresses is beyond the scope of this project.)
This provides the user with information they could otherwise have retrieved by looking up the IP address themselves. This information could be displayed in various ways (popup on hover/click, special page, etc.).
Data
Based on our investigation in T259726, the data our users are looking for is not accessible from freely licensed datasets alone. Ideally IPInfo would combine several datasets, but this depends on agreeing licences with different providers, and will only be considered in future iterations.
The first iteration of IPInfo will use only MaxMind's GeoIP2 databases, which we already have licences for, and which are already available on our servers: T263263#6534392. A PHP package providing an API for these databases is undergoing a security readiness review: T262963.
For third parties who do not have access to the proprietary datasets, IPInfo will use MaxMind's free GeoLite2 databases.
API
IPInfo provides two API endpoints, one taking an edit ID and the other a log ID. If the edit or log action was performed by an anonymous user (or the log target is an anonymous user), the API returns data about the relevant IP address(es).
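The lookup described above might be sketched as follows. This is a minimal illustration in Python, not the extension's actual code: the `Revision` record, the `REVISIONS` table, the `lookup_ip_data` helper and the response shape are all hypothetical. It only shows the idea that the caller passes an edit ID and the server resolves the IP address itself.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Revision:
    rev_id: int
    performer: str  # user name, or an IP address for anonymous edits


# Hypothetical in-memory stand-in for the revision table.
REVISIONS = {
    101: Revision(101, "203.0.113.7"),  # anonymous edit (documentation IP range)
    102: Revision(102, "ExampleUser"),  # edit by a registered user
}


def is_ip_address(name: str) -> bool:
    """Return True if the performer name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def lookup_ip_data(ip: str) -> dict:
    """Hypothetical placeholder for the GeoIP2 lookup; returns dummy data."""
    return {"country": "unknown", "asn": None}


def info_for_revision(rev_id: int) -> Optional[dict]:
    """Resolve an edit ID to IP data, or None if there is nothing to report."""
    rev = REVISIONS.get(rev_id)
    if rev is None or not is_ip_address(rev.performer):
        return None
    # The response is keyed by the edit ID; the IP address is not echoed back.
    return {"rev_id": rev_id, "data": lookup_ip_data(rev.performer)}
```

A log ID endpoint would presumably follow the same pattern, additionally checking whether the log target is an anonymous user.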
Client-side UI
Currently IPInfo adds a button next to IP addresses on Special:Log and history pages. The data are retrieved on clicking this button, and displayed in a popup. This design may change.
Deployment
We expect the feature to be deployed to all wikis, and to be available on certain pages that show IP addresses. It will initially be available only to checkusers.
Preventing abuse
Sending an edit/log ID rather than the IP address will allow IP addresses to remain hidden (once they are hidden in the future), and is intended to prevent IPInfo from being used as an API for getting information about arbitrary IP addresses.
Only users with the 'ipinfo' right can see the information. At first this right will only be given to checkusers. In the future when IP addresses are no longer visible, more users may need to access this information, e.g. for patrolling to fight vandalism. This will depend on user research and testing.
Users will sign an agreement when first using this tool, and a record will be kept of which users have access. Access may be taken away from a user, and a record will be kept of this too: T264150.
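As a rough sketch of the access rules above, assuming that both the 'ipinfo' right and a signed agreement are required before data is shown: the `User` class, the `ACCESS_LOG` list and the function names are hypothetical and only illustrate the intended behaviour, not how the extension stores this.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    rights: set = field(default_factory=set)
    accepted_agreement: bool = False


# Hypothetical audit trail of access changes: (action, user name) tuples.
ACCESS_LOG: list = []


def accept_agreement(user: User) -> None:
    """Record that the user agreed to the terms on first use."""
    if not user.accepted_agreement:
        user.accepted_agreement = True
        ACCESS_LOG.append(("agreed", user.name))


def revoke_access(user: User) -> None:
    """Remove the right and record the removal."""
    user.rights.discard("ipinfo")
    ACCESS_LOG.append(("revoked", user.name))


def can_view_ip_info(user: User) -> bool:
    """Assumed rule: the 'ipinfo' right and a signed agreement must both hold."""
    return "ipinfo" in user.rights and user.accepted_agreement
```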
Preview environment
(Insert one or more links to where the feature can be tested, e.g. on Beta Cluster.)
Hosting the changes on Beta Cluster is a requirement prior to performance review. Please ensure that the feature can be used directly on the link(s) provided, without any data entry such as having to create an article. Any sample content needed should already be present.
If the changes cannot be hosted on Beta Cluster, explain why and provide links to an alternative public environment that the Performance Team can also SSH into. Links to code alone are insufficient for a performance review.
IPInfo is available to logged-in users on our test environment: https://thegoodplace.wmcloud.org/index.php?title=Special:Log
Clicking buttons next to the IP addresses will result in a popup displaying either data or an error message.
Which code to review
(Provide links to all proposed changes and/or repositories. It should also describe changes which have not yet been merged or deployed but are planned prior to deployment. E.g. production Puppet, wmf config, or in-flight features expected to complete prior to launch date, etc.).
Gerrit: mediawiki/extensions/IPInfo (https://gerrit.wikimedia.org/r/admin/repos/mediawiki%2Fextensions%2FIPInfo)
GitHub: https://github.com/wikimedia/mediawiki-extensions-IPInfo
Performance assessment
Please initiate the performance assessment by answering the below:
- What work has been done to ensure the best possible performance of the feature?
Data for each IP address is requested on demand, rather than holding up page load.
Data is only requested once for each log entry or edit on the page.
- What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?
We may be limited by the performance of the GeoIP2 library.
In future versions, if more datasets and services are used, a likely bottleneck will be the performance and availability of third-party web services.
- Are there potential optimisations that haven't been performed yet?
Depending on the licence and the product requirements, we may be able to cache the data for revisions/logs made by anonymous users and make requests to our API without credentials, so that all users are served the information from the Varnish cache.
A new popup widget is appended for each button. The data for each log entry or edit is cached in the widget, but data for the same IP address may be requested more than once if it occurs in a different log entry or edit. We could re-use the same popup widget, and we could store a map of the data for each IP address rather than perform several requests for the same IP address (but different log/edit IDs). The design is subject to change, and we do not yet know how the tool will be used.
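The per-IP map mentioned above could look roughly like this. It is a sketch in Python for illustration only (the real client code would be JavaScript), and the `IPDataCache` class and its `fetch` callback are hypothetical. It also assumes the client can still see the IP address; once addresses are hidden, a different cache key would be needed.

```python
from typing import Callable, Dict


class IPDataCache:
    """Caches results per IP address so repeated entries trigger one request."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch  # hypothetical function that performs the request
        self._by_ip: Dict[str, dict] = {}
        self.requests_made = 0

    def get(self, ip: str) -> dict:
        """Return cached data for the IP, fetching it only on first use."""
        if ip not in self._by_ip:
            self.requests_made += 1
            self._by_ip[ip] = self._fetch(ip)
        return self._by_ip[ip]
```

With a cache like this, two log entries from the same address would result in a single request, whereas the current per-widget caching would issue two.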
- Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: performance-team@wikimedia.org.
We plan on implementing logging via an EventLogging schema, including how long the requests take and how often the tool is used.
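A minimal sketch of the kind of measurement planned, written in Python for illustration: the `EVENTS` list stands in for an EventLogging sink, and the event fields (`action`, `source`, `duration_ms`) are hypothetical, since the schema has not been defined yet.

```python
import time
from typing import Callable, List

EVENTS: List[dict] = []  # hypothetical sink standing in for EventLogging


def timed_request(fetch: Callable[[], dict], source: str) -> dict:
    """Run a request and record how long it took, even if it raises."""
    start = time.perf_counter()
    try:
        return fetch()
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        EVENTS.append(
            {"action": "request", "source": source, "duration_ms": duration_ms}
        )
```

Usage frequency could then be derived by counting the recorded events, and request latency from the `duration_ms` field.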