Remove unnecessary accumulator initialization with loop peeling #16

Open
wants to merge 3 commits into
base: ws

@htyu (Contributor) commented Dec 20, 2024

This PR adds a loop peeling pass that automatically rewrites a matmul K-loop into a peeled form, so the accumulator no longer has to be zero-initialized.

A typical K-loop looks like this:

    accumulator = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, tl.cdiv(K, BLOCK_K)):
        a = tl.load(...)  # A tile for iteration k
        b = tl.load(...)  # B tile for iteration k
        accumulator = tl.dot(a, b, accumulator)

With loop peeling, it becomes:

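    a = tl.load(...)  # A tile for k = 0
    b = tl.load(...)  # B tile for k = 0
    accumulator = tl.dot(a, b)  # first iteration initializes the accumulator
    for k in range(1, tl.cdiv(K, BLOCK_K)):
        a = tl.load(...)  # A tile for iteration k
        b = tl.load(...)  # B tile for iteration k
        accumulator = tl.dot(a, b, accumulator)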
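The equivalence of the two loop forms can be checked with a small pure-Python model. This is only a sketch: tiles are nested lists rather than Triton tensors, and `dot`, `k_loop_baseline`, and `k_loop_peeled` are hypothetical helper names chosen for illustration, not code from this PR.

```python
# Pure-Python model of the K-loop; tiles are small nested lists, not Triton tensors.

def dot(a, b, acc=None):
    """Return a @ b, added to acc when acc is given (like the optional accumulator of tl.dot)."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)] if acc is None else [row[:] for row in acc]
    for i in range(m):
        for j in range(n):
            out[i][j] += sum(a[i][p] * b[p][j] for p in range(k))
    return out


def k_loop_baseline(a_tiles, b_tiles, m, n):
    """Zero-initialize the accumulator, then accumulate every K tile."""
    acc = [[0.0] * n for _ in range(m)]
    for k in range(0, len(a_tiles)):
        acc = dot(a_tiles[k], b_tiles[k], acc)
    return acc


def k_loop_peeled(a_tiles, b_tiles):
    """Peel iteration 0: the first dot produces the accumulator directly."""
    acc = dot(a_tiles[0], b_tiles[0])  # assumes at least one tile exists
    for k in range(1, len(a_tiles)):
        acc = dot(a_tiles[k], b_tiles[k], acc)
    return acc


# Two K tiles of a 2x2 problem, chosen only for illustration.
a_tiles = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.0], [0.0, 0.5]]]
b_tiles = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
assert k_loop_baseline(a_tiles, b_tiles, 2, 2) == k_loop_peeled(a_tiles, b_tiles)
```

The peeled version never materializes a zero tile; the result of the first `dot` serves as the initial accumulator.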

The pass runs after the software pipeliner so that it does not interfere with pipelining.

The peeled form is only correct if the loop executes at least once. To rule out a zero trip count, a tl.assume is needed at the source level:

    tl.assume(tl.cdiv(K, BLOCK_K) > 0)  # the loop runs at least once
    accumulator = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, tl.cdiv(K, BLOCK_K)):
        a = tl.load(...)  # A tile for iteration k
        b = tl.load(...)  # B tile for iteration k
        accumulator = tl.dot(a, b, accumulator)
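To see why the assumption matters, here is a scalar sketch of the two loop shapes with an explicit trip count. The function names and data are hypothetical and only model the control flow, not the actual kernel.

```python
def baseline(xs, ys, trip_count):
    """Zero-initialized accumulator; correct for any trip count, including 0."""
    acc = 0.0
    for k in range(0, trip_count):
        acc += xs[k] * ys[k]
    return acc


def peeled(xs, ys, trip_count):
    """Iteration 0 is hoisted out of the loop and runs unconditionally."""
    acc = xs[0] * ys[0]
    for k in range(1, trip_count):
        acc += xs[k] * ys[k]
    return acc


xs, ys = [2.0, 3.0], [5.0, 7.0]

# With at least one iteration, both forms agree.
assert baseline(xs, ys, 2) == peeled(xs, ys, 2)

# With a zero trip count, the peeled form still executes iteration 0
# and returns a spurious product instead of zero.
assert baseline(xs, ys, 0) == 0.0
assert peeled(xs, ys, 0) != 0.0
```

Without a guarantee that the trip count is positive, the compiler would have to guard the peeled iteration, which defeats the purpose of removing the zero initialization.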

The peeling pass is triggered only for loops with constant bounds or loops whose trip count is covered by a tl.assume.
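The trigger condition amounts to proving that the loop body runs at least once. The sketch below models that check in plain Python. `can_peel` and the `assumed_greater_than` mapping are hypothetical and do not reflect how the pass is actually implemented.

```python
from typing import Dict, Union


def can_peel(lower: int, upper: Union[int, str],
             assumed_greater_than: Dict[str, int]) -> bool:
    """Return True when the loop `range(lower, upper)` provably runs at least once.

    `upper` is either a constant or the name of a symbolic value.
    `assumed_greater_than[name] = c` records an assumption `name > c`.
    """
    if isinstance(upper, int):
        return upper > lower  # constant bounds: decide directly
    # Symbolic bound: need a recorded fact `upper > c` with c >= lower.
    bound = assumed_greater_than.get(upper)
    return bound is not None and bound >= lower


# Constant-bounded loop with a positive trip count.
assert can_peel(0, 8, {})
# Symbolic bound with no assumption: peeling is not safe.
assert not can_peel(0, "num_k_tiles", {})
# Symbolic bound covered by an assumption `num_k_tiles > 0`.
assert can_peel(0, "num_k_tiles", {"num_k_tiles": 0})
```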

A 3% speedup was observed on some shapes.

@facebook-github-bot added the CLA Signed label Dec 20, 2024
@htyu changed the title from "[WIP] Add a loop peeling pass" to "Remove unnecessary accumulator initialization with loop peeling" Dec 20, 2024
@htyu requested review from bertmaher and manman-ren December 20, 2024 21:50