Page MenuHomePhabricator

LoginNotify should inform users of the IP address of failed login attempts to their account
Open, HighPublic15 Estimated Story Points

Assigned To
None
Authored By
MarcoAurelio
Aug 28 2017, 8:00 PM
Referenced Files
F15216678: Capture.PNG
Mar 13 2018, 12:41 AM
F10458991: Untitled-1.png
Oct 27 2017, 2:00 AM
F10455617: Screen Shot 2017-10-26 at 1.05.59 PM.png
Oct 26 2017, 8:11 PM
Tokens
"Like" token, awarded by Nemoralis."Love" token, awarded by Ebrahim."Like" token, awarded by Acagastya."Like" token, awarded by Thibaut120094."Love" token, awarded by Gryllida."Doubloon" token, awarded by TheDJ.

Description

When someone tries to reset our password, be it ourselves or third parties, the IP address of the requestor of the password reset is sent to our inbox with the password reset email. LoginNotify should do the same.

I don't think there would be major privacy concerns as long as its noted where appropriate that trying to login on an account may disclose private data to the owner of that account, if that's not already covered by https://wikimediafoundation.org/wiki/Privacy_policy#To_Protect_You.2C_Ourselves_.26_Others. This will need to be checked with WMF Legal.

Similarly, unsuccessful logins should leave CU traces to prevent abuse, otherwise this feature can become a source of annoyance.

New Notification Data Form

Filling out this form will help developers and product people understand your idea and will provide the information required to implement it. To see examples of the types of answers required, have a look at this sample form. To understand unfamiliar terms, visit the glossary. 

Basic information

  - Purpose of the notification:  To inform the user about failed login attempts to his account
  - Notification name:  Unchanged, reusing notification-known-header-login-fail notification from LoginNotify
  - What triggers notification?: Login attempts
  - "Notice" or "Alert"?:  Alert
  - Notification type (standard, bundled, expandable bundle):  standard, I think (unchanged from the existing notification)

Wording

For a single message

  • Header: Unchanged
  • Body:  Added a new body which reads "IP address of the last login attempt: $1" where $1 is replaced with the IP address

For Bundled Messages

  • Main, bundling message:
  • Subsidiary, bundled message:

Links

  - Primary link _target: None added
  - Primary link label (for email display only): None added

  - #1 secondary link _target: 
  - #1 secondary link label:

  - #2 secondary link _target:
  - #2 secondary link label:

Icon

  - Icon name:  Unchanged
  - Link to graphic/example: Unchanged

See also

T174562: LoginNotify should inform users of the IP address of successful login attempts to their account
T249408: Show useragent data and username on new device login emails

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Alright, I finally figured out how to handle the passing of parameters. Right now, what is passed is the IP address itself (which as we agreed above is bad).

As soon as I hear back from @MaxSem here about the cons of pros of the callback function in CheckUser returning a row ID, I will work on using that (instead of the IP) here.

I think I know why @MaxSem recommended not returning a value from CheckUserHooks::onAuthManagerLoginAuthenticateAudit(). Both that and LoginNotify\Hooks::onAuthManagerLoginAuthenticateAudit() use the AuthManagerLoginAuthenticateAudit hook; therefore, not only we cannot guarantee that the CU callback function is run before the LN callback function, but also, there is no way to pass the output of the CU callback function (may it be a cu_id) to the LN function in order to use it for generating the Echo notification.

The patch I submitted is now fully functional (shows the IP address of the login attempt correctly); however, in its current form, it ends up storing the user's IP in the echo_event table as a serialized value in the event_extra column, like this example: a:3:{s:11:"notifyAgent";b:1;s:2:"ip";s:7:"100.200.103.206";s:5:"count";i:79;} (where 100.200.103.206 was the IP address).

This is problematic because there is no way to purge just the IP from this table. The only alternatives that come to my mind are:

  1. To write a purge script that completely removes the rows that have event_type of "login-fail-new".
  2. To add a new column to that table called event_extra_private which would only contain private information, and then purge that column.

Approach 2 is what AbuseFilter does (it has an afl_ip field which stores the IP address associated with an abuse log entry, and is purged regularly). I think it is best to have a serialized field and not a plain field like the case of AbuseFilter. The reason is we are envisioning a future in which we would want to show more private data than just the IP (hence T174553). So it is best to just have one field in which we store the IP, the GeoNames ID for the city/country, the ISP subnet, etc. and have a purge script that just does that.

I am going to stop here. @Niharika I know that you are eager for this work to be completed; so am I. But I think we are now facing an important data architecture question which is best answered by a more seasoned MediaWiki developer. To the best of my knowledge, the original plan of just storing a cu_id in the Echo tables is impossible, but I could be wrong. If I am not wrong, then we have to make a choice between 1 & 2, and if we choose 2, then we need to patch Echo (which I am happy to take a stab at, but I would prefer to be done by someone more familiar with Echo as well).

@Huji: Echo notifications are meant to be transient, not permanent, and I think with LN notifications especially, there is no reason they should be retained indefinitely. It seems like the easiest solution would be to purge all LN notifications after 90 days (the entire notification, not just the event_extra column), possibly with the same clean-up mechanism that purges all notifications after 2000 (per user).

I'm happy to quickly modify my patch to put the IP in event_extra as soon as a 90-day purging script is made for Echo and enabled on WMF.

This task would probably need to be added to includes/jobs/NotificationDeleteJob.php.

@kaldari no. That job is only run when a notification is sent. Even though you might think it is safe to assume that there are so many notifications that the job will be called many times a minute, that is only correct for big projects (such as English Wikipedia). Smalller WMF projects (such as newly created wikis, the Ombudsmen Wiki, etc.) may not have even a single notification for several weeks, and this makes is theoretically possible for us to retain data beyond the retention period.

Therefore, we should do it the right way, which is create a new maintenance script in Notifications similar to https://phabricator.wikimedia.org/diffusion/ECHU/browse/master/maintenance/purgeOldData.php and schedule it to be run a regular basis (e.g. daily).

if we schedule and require something to run daily, we should probably have an internal error/whistleblower (possibly in NotificationDeleteJob.php) when we detect that such a cleanup script is NOT running.

Hmm, can we just pass cu_changes.cuc_id and load from Checkuser data, thus keeping all the private information in one place?

@MaxSem please see T174388#4048541 in which I explained why that is not possible. Can I ask you to confirm my analysis is correct?

I don't think there would be major privacy concerns

This assumes that all login attempts to the wrong username are malicious. I imagine a lot can be attributed to typos.

This could also be used to extract _target users' IP addresses by deliberately registering multiple accounts which are common typos of a _target usernames. The chance of success for single user might not be that high, but used against a large list of users it could be successful.

I don't think there would be major privacy concerns

This assumes that all login attempts to the wrong username are malicious. I imagine a lot can be attributed to typos.

This could also be used to extract _target users' IP addresses by deliberately registering multiple accounts which are common typos of a _target usernames. The chance of success for single user might not be that high, but used against a large list of users it could be successful.

But we already have a safeguard for that (through AntiSpoof): we don't allow one to create accounts with usernames too similar to an existing account.

AntiSpoof isn't foolproof though, e.g. it disallows Тhryduulf (the first letter is Cyrillic) but probably not Thryduuulf (too many 'u's) or Awkwrad42 (typo for my alt Awkward42).

That is fair.

I wonder how other service providers (such as Facebook or Google) approach this, as we know they have had a similar feature for years.

I would definitely support this. We see it everywhere else, on edits, password resets and so on. So login attempts would be a natural progression. What about 2 Factor authentication for all Wikipedia users also?

I would definitely support this. We see it everywhere else, on edits, password resets and so on. So login attempts would be a natural progression. What about 2 Factor authentication for all Wikipedia users also?

Unfortunately there are scalability issues we need to iron out before we can do something like this. (e.g. when someone forgets their 2FA codes or loses their phone, there is no easy way to get them back without asking a developer to alter the database.)

I'm personally of the opinion that we should have 2FA for all users, but at present there is no failsafe if things go wrong with it.

@Huji any updates? Are you still working on this?

Huji removed Huji as the assignee of this task.Nov 23 2020, 1:05 PM

The patch is still relevant. But I am going to unassign myself.

This is still a major issue that needs attention.

@Piotrus: Please feel free to improve the proposed patch if you'd like to see progress. Thanks.

@Piotrus: Please feel free to improve the proposed patch if you'd like to see progress. Thanks.

I am all for BOLD I am not a coder. I write Wikipedia articles and I expect coders to make the software so I can continue my work.

@Piotrus: See https://www.mediawiki.org/wiki/Bug_management/Development_prioritization for some background - basically, there is no box full of coders with too much time who could fix all and any incoming bug reports, unfortunately.

  NODES
COMMUNITY 2
Idea 1
idea 1
INTERN 1
Note 4
Project 7
USERS 15