Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Total passwords calculated incorrectly #4

Open
Markusnl opened this issue Jul 22, 2019 · 1 comment
Open

Total passwords calculated incorrectly #4

Markusnl opened this issue Jul 22, 2019 · 1 comment

Comments

@Markusnl
Copy link

I think the totalPasswords integer is calculated wrongly. Its currently calculated as

// Get total passwords by dividing by SHA length + 2 - 1
int totalPasswords = (bytes / 42) - 1;

Makes sense, hashes are formatted like this.
000000005AD76BD555C1D6D771DE417A4B87E4B4:4 (first hash in SHA1 V5 ordered by hash HIBP dataset)

So i'm guessing the "+2" are for the ':' and the x amount of times a password has been seen.
Herein lies the problem however, many hashes also have a 'x' > 9.

For example
00000000DD7F2A1C68A35673713783CA390C9E93:630 (third hash in SHA1 V5 ordered by hash HIBP dataset)

Over the many millions of passwords, this adds up quite quickly. According to the PwnedPasswordsDLL, there are 578 million passwords in the dataset, whilst there are only 551 million.

Im not sure yet what problems this could cause, but unlikely to be desirable.

@Markusnl
Copy link
Author

Alright, I've gotten some testing in.

The totalPasswords are miscalculated but this does not affect the functioning of the program. The additional passwords are duplicates from adjacent ones which is not an issue for the binary search algorithm. I don't see this being much worse than potentially adding an additional lookup here and there.

Passwords with x > 9 are handled by the getRecord function, which may just be a happy coincidence

First you set the stream to the required position
inFile.seekg(((long long)(position) * 42), inFile.beg);

And then skip all characters till a newline is found
inFile.ignore((std::numeric_limits<std::streamsize>::max)(), '\n');

The latter here readjusts the position of the stream to the start of the next line. Fixing any x > 9, but also disallowing ever getting the first line of a file. Although you've already covered that case.

     // For whatever reason, the first line is skipped. This should stop that.
	if (upper == -1)
	{
		file.close();
		std::ifstream file(files[ii].c_str(), std::ios::binary);
		char buffer[41] = "";

		// Read 40 chararacters into the buffer string
		file.getline(buffer, 41);

		std::string buff(buffer);

		if (buff.compare(searchValue) == 0)
		{
			passwordNotFound = FALSE;
		}
	}

All in all the DLL works as intended. I'm not sure if this issue warrants a fix, as from the looks of it, it will require extensive re-engineering. Any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant