Speculative decoding: Test the _target distribution (to prevent issues like #32867) #34553

keyboardAnt · 2024-11-01T00:11:26Z

What does this PR do?

This PR adds a test for speculative decoding that checks the _target distribution is preserved, guarding against issues like #32867. The new test (test_speculative_sampling__target_distribution) validates that tokens are sampled with the frequencies implied by the logits, so that speculative decoding follows the expected distribution. It is also a foundational step toward supporting more advanced speculative decoding algorithms, such as token-tree-based rejection sampling, which should add flexibility and performance in future implementations.

Motivation and Context

The speculative decoding process has previously encountered issues where the _target distribution was not preserved (e.g., in issues #32867 and #33534). This PR implements a test to safeguard against such inconsistencies by verifying that:

The most likely tokens are chosen more frequently than less probable ones.
Tokens are selected in alignment with the predefined candidate and new logits.

This test not only improves the reliability of speculative sampling by enforcing distributional accuracy but also prepares the ground for more advanced speculative decoding techniques, such as token-tree-based sampling.
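To make the two checks above concrete, here is a minimal, self-contained sketch of a distribution test for a single speculative sampling step. It is not the test added in this PR and does not call the library's implementation: the function names, the uniform draft logits, the sample count, and the tolerance are all assumptions chosen for illustration. Only the target logits are taken from the diff under review. With those logits, tokens 1 and 8 each have probability of roughly 0.40, token 3 roughly 0.147, token 7 roughly 0.054, and every other token 0.

```python
import math
import random
from collections import Counter

NEG_INF = float("-inf")

# Target logits copied from the diff in this PR.
TARGET_LOGITS = [NEG_INF, 2.0, NEG_INF, 1.0, NEG_INF, NEG_INF, NEG_INF, -0.01, 2.0, NEG_INF]
# Hypothetical draft logits (assumption): a uniform draft over all 10 tokens.
DRAFT_LOGITS = [0.0] * 10


def softmax(logits):
    """Turn logits into probabilities; -inf logits get probability 0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def speculative_sample(draft_probs, target_probs, rng):
    """Propose one token from the draft, then accept it or resample."""
    vocab = list(range(len(target_probs)))
    candidate = rng.choices(vocab, weights=draft_probs)[0]
    # Accept with probability min(1, p_target / p_draft).
    if rng.random() < min(1.0, target_probs[candidate] / draft_probs[candidate]):
        return candidate
    # On rejection, resample from the positive part of (target - draft).
    residual = [max(0.0, p - q) for p, q in zip(target_probs, draft_probs)]
    return rng.choices(vocab, weights=residual)[0]


def empirical_frequencies(num_samples=20_000, seed=0):
    """Sample repeatedly and return the observed frequency of each token id."""
    rng = random.Random(seed)
    draft_probs = softmax(DRAFT_LOGITS)
    target_probs = softmax(TARGET_LOGITS)
    counts = Counter(
        speculative_sample(draft_probs, target_probs, rng) for _ in range(num_samples)
    )
    return [counts[token] / num_samples for token in range(len(target_probs))]
```

A test built on this sketch would assert that tokens with a -inf target logit are never produced, that the observed frequencies are ordered like the target probabilities, and that each frequency is within a small tolerance of its target value. A fixed seed keeps such a test deterministic.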

This PR is an initial step toward advancements in Universal Assisted Generation. In collaboration with @orenpereg, @danielkorat, @mosheber, @jmamou, and @MosheWasserb, we're preparing a new speculative decoding function, and this test will verify that it preserves the _target distribution losslessly.

Dependencies

No additional dependencies are required.

Linked Issues

#32867, #33534

Before Submitting Checklist

I have read the contributor guidelines.
Documentation updates are not needed as this is a test enhancement.
New test coverage has been added to verify the speculative sampling behavior.

Who can review?

@gante

gante

Thank you for working on improving our tests 💛

A question: is this test somewhat fast to run (<5s)? If yes, amazing! If no, let's either a) reduce the number in range or b) tag the test as @slow [note: tests with @slow are usually run daily, so bad commits may squeeze in]
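The two options in this question can be sketched as follows. The `slow` decorator below is a stdlib-only stand-in written for illustration, on the assumption that slow tests are gated by a RUN_SLOW environment variable; in the repository itself one would use the project's own slow marker rather than this helper. The class name, test names, and sample counts are hypothetical.

```python
import os
import unittest


def slow(test_case):
    """Stand-in slow marker (assumption): skip unless RUN_SLOW is truthy."""
    run_slow = os.environ.get("RUN_SLOW", "0").lower() in {"1", "true", "yes"}
    return unittest.skipUnless(run_slow, "test is slow")(test_case)


class DistributionTest(unittest.TestCase):
    # Option (a): keep the loop count small so the test stays in the fast suite.
    NUM_SAMPLES = 1_000

    def test_fast_variant(self):
        self.assertEqual(len(range(self.NUM_SAMPLES)), 1_000)

    # Option (b): keep the large loop, but run it only when slow tests are enabled.
    @slow
    def test_slow_variant(self):
        self.assertEqual(len(range(self.NUM_SAMPLES * 100)), 100_000)
```

The trade-off is the one noted in the comment: a test that only runs in the slow suite gives later feedback, so a regression can land before it is caught.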

gante · 2024-11-04T11:07:38Z

tests/generation/test_utils.py

+                    [
+                        -inf,
+                        2.0,
+                        -inf,
+                        1.0,
+                        -inf,
+                        -inf,
+                        -inf,
+                        -0.01,
+                        2.0,
+                        -inf,
+                    ],  # most likely to be 1 or 8, less likely to be 3, then 7, and should never be any other value


Nit: let's make it in one line, so we can quickly compare indexes with other tensors.

(you'll have to remove the comma after the last -inf, otherwise the make fixup command will make it revert back to this format)

I changed the formatting as requested, but ruff's formatting check then failed the CI. (make fixup still reformats it into a column, even after removing the last comma you mentioned)

you can use # fmt: off and # fmt: on

Thanks @ArthurZucker. I changed all these inline comments to block comments, and it solved the issue while keeping the ruff checks on. 👍
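For readers following this formatting thread, the two approaches discussed can be sketched as below. This is an illustration rather than the code that was merged: the variable names are hypothetical, and `inf` is imported from `math` here as an assumption, since the diff does not show where `inf` comes from.

```python
from math import inf

# Approach 1: suppress the formatter around the one-line list, so the
# inline comment can stay even though the line is long.
# fmt: off
target_logits_inline = [-inf, 2.0, -inf, 1.0, -inf, -inf, -inf, -0.01, 2.0, -inf]  # most likely to be 1 or 8, less likely to be 3, then 7, and should never be any other value
# fmt: on

# Approach 2: move the inline comment to a block comment above the list.
# Without a trailing comma after the last element, the list fits on one
# line and the formatter has no reason to split it into a column.
# most likely to be 1 or 8, less likely to be 3, then 7, and should never be any other value
target_logits_block = [-inf, 2.0, -inf, 1.0, -inf, -inf, -inf, -0.01, 2.0, -inf]
```

Both forms keep the indexes on one line for quick comparison with other tensors. The second keeps the formatter enabled for the whole file, which is the outcome described in the last reply.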

keyboardAnt · 2024-11-05T01:23:53Z

Thank you for working on improving our tests 💛

A question: is this test somewhat fast to run (<5s)? If yes, amazing! If no, let's either a) reduce the number in range or b) tag the test as @slow [note: tests with @slow are usually run daily, so bad commits may squeeze in]

The test itself takes 1.89 s, and when you run it with pytest (pytest tests/generation/test_utils.py::UtilsFunctionsTest::test_speculative_sampling__target_distribution), it's not more than 2.57 s. Although it was only tested locally on my laptop, I believe it’s safe to keep it with the rest of the <5s tests.

ArthurZucker

Thanks

ArthurZucker · 2024-11-05T10:10:40Z

you can use # fmt: off and # fmt: on

keyboardAnt · 2024-11-06T01:14:43Z

All checks have successfully passed (screenshot below). Are there any additional workflows to run before merging?

keyboardAnt · 2024-11-13T20:26:47Z

@gante @ArthurZucker
Would appreciate your help with finalizing the review

ArthurZucker · 2024-11-22T15:02:20Z

Yep sorry for the delay, merging!

… like huggingface#32867) (huggingface#34553) * Update test_utils.py * formatting * Update test_utils.py * formatting * formatting * Update test_utils.py * formatting * Update test_utils.py * formatting * format * comments at standard positions

keyboardAnt added 4 commits October 31, 2024 19:59

Update test_utils.py

87e11f0

formatting

1be059c

Update test_utils.py

d9c3fdc

formatting

5522333

keyboardAnt force-pushed the test-speculative-sampling-distribution branch from 1be059c to 5522333 Compare November 1, 2024 00:14

gante approved these changes Nov 4, 2024

gante requested a review from ArthurZucker November 4, 2024 11:11

keyboardAnt added 3 commits November 4, 2024 20:11

formatting

4686ec0

Merge branch 'test-speculative-sampling-distribution' of https://gith…

67f9cb1

…ub.com/keyboardAnt/transformers into test-speculative-sampling-distribution

Update test_utils.py

20b2cef

keyboardAnt added 3 commits November 4, 2024 20:33

formatting

3a05527

Update test_utils.py

90d1ee0

formatting

dc3be00

keyboardAnt force-pushed the test-speculative-sampling-distribution branch from 3a05527 to dc3be00 Compare November 5, 2024 02:19

ArthurZucker approved these changes Nov 5, 2024

keyboardAnt added 4 commits November 5, 2024 19:51

Merge branch 'test-speculative-sampling-distribution' of https://gith…

2fa00e0

…ub.com/keyboardAnt/transformers into test-speculative-sampling-distribution

format

7d84f6d

comments at standard positions

cb6fcd7

Merge branch 'main' into test-speculative-sampling-distribution

7352d60

keyboardAnt mentioned this pull request Nov 16, 2024

[OLD] New PR: #35029. [[Universal Speculative Decoding CandidateGenerator]] #34760

Closed

6 tasks

keyboardAnt requested review from gante and ArthurZucker November 19, 2024 22:00

ArthurZucker merged commit 42b36d7 into huggingface:main Nov 22, 2024
22 checks passed

keyboardAnt deleted the test-speculative-sampling-distribution branch November 22, 2024 15:36

keyboardAnt mentioned this pull request Nov 30, 2024

Universal Speculative Decoding CandidateGenerator #35029

Open

6 tasks

keyboardAnt mentioned this pull request Dec 13, 2024

Add unittests for Universal Assisted generation keyboardAnt/transformers#8

Merged
