Optimize Float.parse/1 #14159

Merged: 2 commits, Jan 9, 2025

@dkuku dkuku commented Jan 8, 2025

In #14130 I benchmarked building binaries against building lists.
Building lists is faster, and since Erlang works with lists natively, we can take advantage of that in float parsing.
In the benchmarks below it is roughly 1.6x faster regardless of input size:

Name                    ips        average  deviation         median         99th %                                                                                                                      
IOFloat.parse      239.03 K        4.18 μs   ±130.63%        4.05 μs        6.09 μs                                                                                                                      
Float.parse        149.34 K        6.70 μs   ±128.69%        5.75 μs       36.58 μs                                                                                                                      
                                                                                                                                                                                                         
Comparison:                                                                                                                                                                                              
IOFloat.parse      239.03 K                                                                                                                                                                              
Float.parse        149.34 K - 1.60x slower +2.51 μs                                                                                                                                                      
                                                                                                                                                                                                         
Extended statistics:                                                                                                                                                                                     
                                                                                             
Name                  minimum        maximum    sample size                     mode         
IOFloat.parse         3.72 μs     3021.96 μs       455.37 K                  3.96 μs         
Float.parse           5.28 μs     2353.56 μs       289.41 K                  5.72 μs         
                                                                                             
Memory usage statistics:                                                                     
                                                                                             
Name             Memory usage                                                                
IOFloat.parse        12.27 KB                                                                
Float.parse          12.70 KB - 1.04x memory usage +0.44 KB                                  
                                                                                             
**All measurements for memory usage were the same**                                                 
                                                                                             
Profiling IOFloat.parse with eprof...                                                        
                                                                                             
Profile results of #PID<0.221596.0>                                                          
#                                           CALLS     % TIME µS/CALL
Total                                         450 100.0  100    0.22
Enum.map/2                                      1  0.00    0    0.00                                
anonymous fn/0 in CalendarStringBench.run/0     1  0.00    0    0.00                                
:erlang.apply/2                                 1  4.00    4    4.00                                
IOFloat.parse_unsigned/1                       33  5.00    5    0.15                         
IOFloat.add_dot/2                              43  6.00    6    0.14                                
:lists.reverse/1                               29  6.00    6    0.21                         
Enum."-map/2-lists^map/1-1-"/2                 34  7.00    7    0.21                                
:lists.reverse/2                               29  9.00    9    0.31                                
IOFloat.parse/1                                33 16.00   16    0.48                                
:erlang.list_to_float/1                        29 20.00   20    0.69                                
IOFloat.parse_unsigned/4                      217 27.00   27    0.12
                                                                                             
Profile done over 11 matching functions                                                      
                                                                                             
Profiling Float.parse with eprof...                                                          
                                                                                             
Profile results of #PID<0.221598.0>                                                          
#                                           CALLS     % TIME µS/CALL                         
Total                                         392 100.0  126    0.32                         
Enum.map/2                                      1  0.00    0    0.00
anonymous fn/0 in CalendarStringBench.run/0     1  0.00    0    0.00                         
:erlang.apply/2                                 1  0.79    1    1.00                         
Float.parse_unsigned/1                         33  4.76    6    0.18                        
Enum."-map/2-lists^map/1-1-"/2                 34  6.35    8    0.24                                         
Float.add_dot/2                                43  7.14    9    0.21                                         
Float.parse/1                                  33  8.73   11    0.33                                         
:erlang.binary_to_float/1                      29 21.43   27    0.93                                         
Float.parse_unsigned/4                        217 50.79   64    0.29    
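
The profile above shows the list-based version spending its time in `:lists.reverse/1` and `:erlang.list_to_float/1` rather than `:erlang.binary_to_float/1`. A minimal sketch of that idea follows, assuming a simplified grammar (optional sign, digits, optional fraction, no exponent). The module name `ListFloatSketch` and its helper names are hypothetical; this is not the code merged in the PR.

```elixir
defmodule ListFloatSketch do
  # Hypothetical illustration only: accumulate characters in a reversed
  # charlist, then convert once with :erlang.list_to_float/1.

  # Returns {float, rest} or :error, mirroring the shape of Float.parse/1.
  def parse("-" <> rest), do: negate(parse_unsigned(rest))
  def parse("+" <> rest), do: parse_unsigned(rest)
  def parse(binary) when is_binary(binary), do: parse_unsigned(binary)

  defp negate({float, rest}), do: {-float, rest}
  defp negate(:error), do: :error

  # The number must start with a digit.
  defp parse_unsigned(<<digit, rest::binary>>) when digit in ?0..?9,
    do: int_part(rest, [digit])

  defp parse_unsigned(_), do: :error

  # Integer part: prepend each digit to the reversed accumulator.
  defp int_part(<<digit, rest::binary>>, acc) when digit in ?0..?9,
    do: int_part(rest, [digit | acc])

  # A dot only counts when it is followed by at least one digit.
  defp int_part(<<?., digit, rest::binary>>, acc) when digit in ?0..?9,
    do: frac_part(rest, [digit, ?. | acc])

  # No fraction: add ".0" because list_to_float/1 requires a decimal point.
  defp int_part(rest, acc), do: finish(rest, [?0, ?. | acc])

  defp frac_part(<<digit, rest::binary>>, acc) when digit in ?0..?9,
    do: frac_part(rest, [digit | acc])

  defp frac_part(rest, acc), do: finish(rest, acc)

  # Reverse once at the end and let the runtime do the conversion.
  defp finish(rest, acc), do: {:erlang.list_to_float(:lists.reverse(acc)), rest}
end
```

Prepending to a list is a constant-time operation, so the only extra work is the single reverse at the end. Exponents and overflow handling are deliberately left out of this sketch.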

My Benchee module (I took the floats from the test module):

defmodule CalendarStringBench do
  def run do
    # Sample float strings for benchmarking, taken from the test module
    floats = [
      "12",
      "-12",
      "-0.1",
      "123456789",
      "12.5",
      "12.524235",
      "-12.5",
      "-12.524235",
      "0.3534091",
      "0.3534091elixir",
      "7.5e3",
      "7.5e-3",
      "12x",
      "12.5x",
      "-12.32453e10",
      "-12.32453e-10",
      "0.32453e-10",
      "1.32453e-10",
      "1.7976931348623159e-99999foo",
      "1.32.45",
      "1.o",
      "+12.3E+4",
      "+12.3E-4x",
      "-1.23e-0xFF",
      "-1.e2",
      ".12",
      "--1.2",
      "++1.2",
      "pi",
      "1.7976931348623157e308",
      "1.7976931348623157e308foo",
      "1.7976931348623157e+308foo",
      "1.7976931348623157e-308foo"
    ]

    Benchee.run(
      %{
        "Float.parse" => fn ->
          Enum.map(floats, &Float.parse/1)
        end,
        "IOFloat.parse" => fn ->
          Enum.map(floats, &IOFloat.parse/1)
        end
      },
      profile_after: true,
      time: 2,
      memory_time: 2,
      formatters: [
        {Benchee.Formatters.Console, extended_statistics: true}
      ]
    )
  end
end

# Run the benchmark
CalendarStringBench.run()

For a single value (no Enum.map, just the function call):
float = "-12.5"
fn -> Float.parse(float) end

Name                    ips        average  deviation         median         99th %
IOFloat.parse        6.73 M      148.50 ns ±12791.77%         110 ns         201 ns
Float.parse          4.05 M      247.18 ns  ±8164.14%         161 ns         330 ns

float = "-12.524235"
Name                    ips        average  deviation         median         99th %
IOFloat.parse        5.89 M      169.75 ns  ±1845.05%         160 ns         261 ns
Float.parse          3.64 M      275.07 ns  ±3312.69%         231 ns         421 ns

float = "-12.524235123456789"
Name                    ips        average  deviation         median         99th %
IOFloat.parse        4.08 M      244.99 ns  ±4717.99%         220 ns         381 ns
Float.parse          2.42 M      413.63 ns  ±2895.51%         321 ns         541 ns

float = "-12.524235e100"
Name                    ips        average  deviation         median         99th %
IOFloat.parse        4.78 M      209.20 ns  ±3218.10%         190 ns         331 ns
Float.parse          3.00 M      333.27 ns  ±3416.01%         280 ns         490 ns
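
To make the comparison across input sizes explicit, the speedup can be computed as the ratio of the two ips figures. The sketch below copies the ips values from the single-value results above; the module name `SpeedupSketch` is hypothetical.

```elixir
defmodule SpeedupSketch do
  # ips figures (in millions) copied from the single-value results:
  # {input, IOFloat.parse ips, Float.parse ips}
  @results [
    {"-12.5", 6.73, 4.05},
    {"-12.524235", 5.89, 3.64},
    {"-12.524235123456789", 4.08, 2.42},
    {"-12.524235e100", 4.78, 3.00}
  ]

  # Speedup = new ips / old ips, rounded to two decimal places.
  def ratios do
    for {input, new_ips, old_ips} <- @results do
      {input, Float.round(new_ips / old_ips, 2)}
    end
  end
end

IO.inspect(SpeedupSketch.ratios())
```

Each ratio falls between 1.5 and 1.7, in line with the multi-input benchmark.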

@sabiwara changed the title from dk_optimize_float_parsing to Optimize Float.parse/1 on Jan 8, 2025

@sabiwara sabiwara left a comment

Great finding, thanks @dkuku! 💜

@josevalim josevalim merged commit 194ecfd into elixir-lang:main Jan 9, 2025
9 checks passed
@josevalim

💚 💙 💜 💛 ❤️

@vanderhoop

@dkuku 👏 Given your work with IO data in #14130 and now in this PR... I'm wondering if you (or others) have ideas about other spots you think could benefit from similar optimizations?

@dkuku

dkuku commented Jan 9, 2025

The language core is well optimized.
Even in cases where benchmarks show a 50% difference, we're only talking about nanoseconds in rare scenarios.
