Memory profiling used the Python memory-profiler package, version 0.61.0. A helper script, run-memory-profiler.py, was developed to run the tests; a sample invocation is presented in the snippet below:
Benchmarking used the AIList dataset, transcoded into the Parquet file format with Snappy compression.
This dataset was published with the AIList paper:
Jianglin Feng, Aakrosh Ratan, Nathan C. Sheffield, Augmented Interval List: a novel data structure for efficient genomic interval search, Bioinformatics 2019.
Single-thread results for the overlap, nearest, count-overlaps, and coverage operations on the apple-m3-max and gcp-linux platforms.
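The Speedup column in the tables of this section is consistent with dividing the bioframe mean runtime by each library's mean runtime (for example, 0.103354 / 0.0383 ≈ 2.70 for polars_bio on the 1-2 overlap test). The following is a minimal sketch of that calculation, not the benchmark's actual code; the function names `summarize` and `speedup` are hypothetical and introduced only for illustration.

```python
from statistics import mean


def summarize(timings):
    """Return (min, max, mean) in seconds for a list of repeated run times."""
    return min(timings), max(timings), mean(timings)


def speedup(baseline_mean, library_mean):
    """Ratio of the baseline mean runtime to a library's mean runtime."""
    return baseline_mean / library_mean


# Mean runtimes (s) copied from the apple-m3-max, 1-2, overlap table.
means = {
    "polars_bio": 0.0383,
    "bioframe": 0.103354,  # assumed baseline: its speedup is 1.00x in every table
    "pyranges0": 0.028001,
    "pybedtools": 0.348434,
}

for library, value in means.items():
    print(f"{library}: {speedup(means['bioframe'], value):.2f}x")
```

A value above 1.00x means the library was faster than bioframe on that test; a value below 1.00x means it was slower.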
apple-m3-max
1-2
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.035619 | 0.043113 | 0.0383 | 2.70x |
| bioframe | 0.102257 | 0.104425 | 0.103354 | 1.00x |
| pyranges0 | 0.025425 | 0.032821 | 0.028001 | 3.69x |
| pyranges1 | 0.059608 | 0.064147 | 0.061763 | 1.67x |
| pybedtools | 0.343204 | 0.352804 | 0.348434 | 0.30x |
| genomicranges | 1.042893 | 1.044245 | 1.043488 | 0.10x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.039943 | 0.045166 | 0.042109 | 4.45x |
| bioframe | 0.185452 | 0.189631 | 0.187388 | 1.00x |
| pyranges0 | 0.092334 | 0.09634 | 0.093688 | 2.00x |
| pyranges1 | 0.133631 | 0.134179 | 0.133981 | 1.40x |
| pybedtools | 0.756676 | 0.761866 | 0.75953 | 0.25x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.026706 | 0.029754 | 0.028142 | 4.69x |
| bioframe | 0.131124 | 0.133729 | 0.132052 | 1.00x |
| pyranges0 | 0.039136 | 0.039774 | 0.039377 | 3.35x |
| pyranges1 | 0.061976 | 0.063181 | 0.062658 | 2.11x |
| pybedtools | 0.665804 | 0.673844 | 0.668534 | 0.20x |
| genomicranges | 0.994963 | 1.006435 | 0.999389 | 0.13x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.0262 | 0.028749 | 0.027418 | 6.30x |
| bioframe | 0.16949 | 0.176628 | 0.172842 | 1.00x |
| pyranges0 | 0.07376 | 0.076708 | 0.075369 | 2.29x |
| pyranges1 | 0.128027 | 0.133263 | 0.130247 | 1.33x |
| pybedtools | 0.701817 | 0.708726 | 0.705839 | 0.24x |
| genomicranges | 1.032651 | 1.049059 | 1.040799 | 0.17x |
8-7
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 3.987391 | 4.648581 | 4.235518 | 7.17x |
| bioframe | 29.793837 | 30.991576 | 30.375518 | 1.00x |
| pyranges0 | 15.632212 | 15.974075 | 15.857213 | 1.92x |
| pyranges1 | 31.622804 | 33.699074 | 32.680701 | 0.93x |
| pybedtools | 916.711575 | 919.974811 | 918.154834 | 0.03x |
| genomicranges | 479.214112 | 487.832054 | 484.579554 | 0.06x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 2.116922 | 2.169534 | 2.139006 | 32.13x |
| bioframe | 68.581465 | 68.992651 | 68.725495 | 1.00x |
| pyranges0 | 1.381964 | 1.508513 | 1.424446 | 48.25x |
| pyranges1 | 2.697684 | 2.728407 | 2.717532 | 25.29x |
| pybedtools | 35.528719 | 35.876667 | 35.699544 | 1.93x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.445467 | 1.484052 | 1.46225 | 58.77x |
| bioframe | 85.632767 | 86.26148 | 85.935955 | 1.00x |
| pyranges0 | 9.674847 | 9.833233 | 9.753982 | 8.81x |
| pyranges1 | 10.170249 | 10.254359 | 10.201813 | 8.42x |
| pybedtools | 33.101592 | 33.966188 | 33.423595 | 2.57x |
| genomicranges | 488.972732 | 490.395787 | 489.548184 | 0.18x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.195279 | 1.205765 | 1.199323 | 20.45x |
| bioframe | 24.423391 | 24.682901 | 24.525909 | 1.00x |
| pyranges0 | 11.093644 | 11.328071 | 11.220416 | 2.19x |
| pyranges1 | 11.987003 | 12.147925 | 12.066045 | 2.03x |
| pybedtools | 59.699275 | 60.04087 | 59.84965 | 0.41x |
| genomicranges | 500.041974 | 503.31936 | 502.043072 | 0.05x |
100-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.002471 | 0.006262 | 0.003855 | 0.54x |
| bioframe | 0.001374 | 0.002735 | 0.002067 | 1.00x |
| pyranges0 | 0.000977 | 0.001952 | 0.001337 | 1.55x |
| pyranges1 | 0.002276 | 0.003591 | 0.002739 | 0.75x |
| pybedtools | 0.006856 | 0.010064 | 0.008032 | 0.26x |
| genomicranges | 0.001784 | 0.002115 | 0.001938 | 1.07x |
| pygenomics | 0.000475 | 0.000541 | 0.000509 | 4.06x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.002802 | 0.007312 | 0.004371 | 0.51x |
| bioframe | 0.00157 | 0.00347 | 0.002251 | 1.00x |
| pyranges0 | 0.00135 | 0.004085 | 0.002281 | 0.99x |
| pyranges1 | 0.002084 | 0.003622 | 0.002633 | 0.85x |
| pybedtools | 0.005288 | 0.023073 | 0.011717 | 0.19x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.001892 | 0.006355 | 0.003397 | 0.52x |
| bioframe | 0.001563 | 0.002165 | 0.001775 | 1.00x |
| pyranges1 | 0.00181 | 0.002209 | 0.001972 | 0.90x |
| pybedtools | 0.020892 | 0.062978 | 0.036866 | 0.05x |
| genomicranges | 0.001896 | 0.002057 | 0.001957 | 0.91x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.001911 | 0.006057 | 0.003343 | 1.03x |
| bioframe | 0.003065 | 0.00411 | 0.003452 | 1.00x |
| pyranges1 | 0.004455 | 0.005845 | 0.005021 | 0.69x |
| pybedtools | 0.02477 | 0.059532 | 0.037421 | 0.09x |
1000-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.00262 | 0.004367 | 0.003278 | 0.71x |
| bioframe | 0.001909 | 0.002988 | 0.002313 | 1.00x |
| pyranges0 | 0.001361 | 0.00182 | 0.001543 | 1.50x |
| pyranges1 | 0.002678 | 0.003166 | 0.002927 | 0.79x |
| pybedtools | 0.037238 | 0.039737 | 0.038453 | 0.06x |
| genomicranges | 0.019265 | 0.019945 | 0.01957 | 0.12x |
| pygenomics | 0.006876 | 0.006994 | 0.006949 | 0.33x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.003048 | 0.0083 | 0.00553 | 0.65x |
| bioframe | 0.003269 | 0.004119 | 0.003604 | 1.00x |
| pyranges0 | 0.002514 | 0.003506 | 0.003099 | 1.16x |
| pyranges1 | 0.003722 | 0.00418 | 0.003935 | 0.92x |
| pybedtools | 0.00881 | 0.011281 | 0.009729 | 0.37x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.001854 | 0.004714 | 0.002898 | 1.00x |
| bioframe | 0.002523 | 0.003547 | 0.002898 | 1.00x |
| pyranges1 | 0.002302 | 0.002838 | 0.002498 | 1.16x |
| pybedtools | 0.032681 | 0.047822 | 0.037981 | 0.08x |
| genomicranges | 0.020029 | 0.02029 | 0.020192 | 0.14x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.002202 | 0.003516 | 0.002696 | 1.77x |
| bioframe | 0.004238 | 0.005691 | 0.004758 | 1.00x |
| pyranges1 | 0.004909 | 0.005934 | 0.005284 | 0.90x |
| pybedtools | 0.030735 | 0.045004 | 0.03646 | 0.13x |
10000-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.004603 | 0.008294 | 0.006073 | 1.81x |
| bioframe | 0.010529 | 0.011367 | 0.011014 | 1.00x |
| pyranges0 | 0.006498 | 0.007306 | 0.006811 | 1.62x |
| pyranges1 | 0.01096 | 0.012611 | 0.011684 | 0.94x |
| pybedtools | 0.94646 | 0.94995 | 0.948121 | 0.01x |
| genomicranges | 0.198868 | 0.200266 | 0.199428 | 0.06x |
| pygenomics | 0.080325 | 0.08121 | 0.080663 | 0.14x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.004851 | 0.007782 | 0.005908 | 4.50x |
| bioframe | 0.025947 | 0.027779 | 0.026584 | 1.00x |
| pyranges0 | 0.00501 | 0.005703 | 0.00526 | 5.05x |
| pyranges1 | 0.007517 | 0.007937 | 0.00769 | 3.46x |
| pybedtools | 0.040749 | 0.043864 | 0.041889 | 0.63x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.003283 | 0.008069 | 0.005083 | 3.12x |
| bioframe | 0.014669 | 0.016689 | 0.015834 | 1.00x |
| pyranges1 | 0.007637 | 0.008979 | 0.008178 | 1.94x |
| pybedtools | 0.720797 | 0.730655 | 0.725407 | 0.02x |
| genomicranges | 0.202131 | 0.209398 | 0.204628 | 0.08x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.002756 | 0.004613 | 0.003377 | 3.06x |
| bioframe | 0.009849 | 0.011243 | 0.010339 | 1.00x |
| pyranges1 | 0.01326 | 0.015308 | 0.013973 | 0.74x |
| pybedtools | 0.727294 | 0.733098 | 0.73116 | 0.01x |
100000-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.030583 | 0.038892 | 0.033394 | 3.33x |
| bioframe | 0.108358 | 0.115233 | 0.111059 | 1.00x |
| pyranges0 | 0.059633 | 0.065599 | 0.061791 | 1.80x |
| pyranges1 | 0.100074 | 0.105947 | 0.102267 | 1.09x |
| pybedtools | 13.434458 | 13.602339 | 13.496321 | 0.01x |
| genomicranges | 2.030365 | 2.052434 | 2.039897 | 0.05x |
| pygenomics | 1.001974 | 1.018231 | 1.009213 | 0.11x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.03013 | 0.036718 | 0.03339 | 10.61x |
| bioframe | 0.352786 | 0.356839 | 0.354241 | 1.00x |
| pyranges0 | 0.032403 | 0.034701 | 0.033667 | 10.52x |
| pyranges1 | 0.044958 | 0.046169 | 0.045629 | 7.76x |
| pybedtools | 0.369122 | 0.379131 | 0.3729 | 0.95x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.021035 | 0.026894 | 0.023802 | 13.86x |
| bioframe | 0.308013 | 0.347919 | 0.329806 | 1.00x |
| pyranges1 | 0.076199 | 0.085019 | 0.079372 | 4.16x |
| pybedtools | 11.056327 | 11.280248 | 11.149039 | 0.03x |
| genomicranges | 2.057607 | 2.07651 | 2.067998 | 0.16x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.013263 | 0.014998 | 0.013874 | 5.68x |
| bioframe | 0.077717 | 0.081116 | 0.078865 | 1.00x |
| pyranges1 | 0.094753 | 0.114552 | 0.10257 | 0.77x |
| pybedtools | 11.374602 | 11.428316 | 11.393849 | 0.01x |
1000000-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.482548 | 0.538737 | 0.507383 | 2.55x |
| bioframe | 1.26082 | 1.35031 | 1.296195 | 1.00x |
| pyranges0 | 0.775969 | 0.828801 | 0.810501 | 1.60x |
| pyranges1 | 1.272326 | 1.29706 | 1.28585 | 1.01x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.439544 | 0.488414 | 0.458975 | 14.86x |
| bioframe | 6.592501 | 7.111734 | 6.818208 | 1.00x |
| pyranges0 | 0.398173 | 0.413055 | 0.406623 | 16.77x |
| pyranges1 | 0.51649 | 0.520946 | 0.518407 | 13.15x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.257781 | 0.305275 | 0.28525 | 17.65x |
| bioframe | 4.640915 | 5.437883 | 5.033454 | 1.00x |
| pyranges1 | 0.916714 | 0.925945 | 0.920594 | 5.47x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.128241 | 0.137198 | 0.132474 | 7.71x |
| bioframe | 0.996542 | 1.065777 | 1.021738 | 1.00x |
| pyranges1 | 1.115134 | 1.247674 | 1.172964 | 0.87x |
10000000-1p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 8.532137 | 9.738828 | 8.978132 | 2.20x |
| bioframe | 19.276665 | 20.295566 | 19.708064 | 1.00x |
| pyranges0 | 14.819439 | 15.339048 | 15.092611 | 1.31x |
| pyranges1 | 20.153432 | 22.654892 | 21.56345 | 0.91x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 7.12779 | 7.490779 | 7.263011 | 22.17x |
| bioframe | 156.356696 | 169.531002 | 160.989714 | 1.00x |
| pyranges0 | 6.402183 | 6.879779 | 6.62806 | 24.29x |
| pyranges1 | 7.526236 | 8.176338 | 7.857803 | 20.49x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 4.887937 | 5.553197 | 5.165014 | 20.21x |
| bioframe | 102.637625 | 105.903506 | 104.389343 | 1.00x |
| pyranges1 | 13.35283 | 15.167609 | 14.19713 | 7.35x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.627897 | 1.683304 | 1.655288 | 9.86x |
| bioframe | 15.586487 | 16.774274 | 16.316676 | 1.00x |
| pyranges1 | 16.99118 | 17.447484 | 17.195844 | 0.95x |
gcp-linux
1-2
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.045943 | 0.064732 | 0.054234 | 1.66x |
| bioframe | 0.084137 | 0.099481 | 0.090107 | 1.00x |
| pyranges0 | 0.056206 | 0.065654 | 0.061844 | 1.46x |
| pyranges1 | 0.09908 | 0.119018 | 0.106228 | 0.85x |
| pybedtools | 0.38246 | 0.406379 | 0.39153 | 0.23x |
| genomicranges | 1.19939 | 1.224621 | 1.208255 | 0.07x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.057012 | 0.073822 | 0.064665 | 2.49x |
| bioframe | 0.158764 | 0.165707 | 0.161273 | 1.00x |
| pyranges0 | 0.172297 | 0.176259 | 0.17363 | 0.93x |
| pyranges1 | 0.217619 | 0.234088 | 0.22335 | 0.72x |
| pybedtools | 0.845945 | 0.84898 | 0.847447 | 0.19x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.035631 | 0.043555 | 0.04066 | 2.74x |
| bioframe | 0.108015 | 0.116522 | 0.111266 | 1.00x |
| pyranges0 | 0.077336 | 0.080282 | 0.07844 | 1.42x |
| pyranges1 | 0.100883 | 0.106671 | 0.103181 | 1.08x |
| pybedtools | 0.745958 | 0.759006 | 0.754393 | 0.15x |
| genomicranges | 1.154942 | 1.164158 | 1.158506 | 0.10x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.036476 | 0.040001 | 0.037897 | 5.10x |
| bioframe | 0.189201 | 0.20046 | 0.193401 | 1.00x |
| pyranges0 | 0.141659 | 0.14424 | 0.143188 | 1.35x |
| pyranges1 | 0.206033 | 0.224902 | 0.213089 | 0.91x |
| pybedtools | 0.773732 | 0.780424 | 0.776934 | 0.25x |
| genomicranges | 1.186341 | 1.194172 | 1.189255 | 0.16x |
8-7
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 6.235223 | 9.61441 | 7.723144 | 6.54x |
| bioframe | 50.319263 | 50.956633 | 50.537202 | 1.00x |
| pyranges0 | 36.371926 | 36.581642 | 36.448645 | 1.39x |
| pyranges1 | 63.336711 | 63.455435 | 63.40654 | 0.80x |
| pybedtools | 1149.001487 | 1152.127068 | 1150.070659 | 0.04x |
| genomicranges | 597.951648 | 599.960895 | 599.002871 | 0.08x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 3.576373 | 3.679698 | 3.633697 | 15.54x |
| bioframe | 56.301865 | 56.776617 | 56.464305 | 1.00x |
| pyranges0 | 2.45308 | 2.60494 | 2.505172 | 22.54x |
| pyranges1 | 4.975662 | 5.011008 | 4.997007 | 11.30x |
| pybedtools | 44.181913 | 44.79409 | 44.386971 | 1.27x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 2.052196 | 2.104447 | 2.075706 | 38.15x |
| bioframe | 79.174164 | 79.234115 | 79.194209 | 1.00x |
| pyranges0 | 18.797436 | 18.851941 | 18.824498 | 4.21x |
| pyranges1 | 20.399172 | 20.436149 | 20.418562 | 3.88x |
| pybedtools | 35.850631 | 36.142479 | 36.041115 | 2.20x |
| genomicranges | 612.985873 | 613.52087 | 613.229997 | 0.13x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.829478 | 1.838981 | 1.834999 | 15.44x |
| bioframe | 28.29136 | 28.361417 | 28.326821 | 1.00x |
| pyranges0 | 18.611247 | 20.021441 | 19.473105 | 1.45x |
| pyranges1 | 22.118838 | 22.210733 | 22.161329 | 1.28x |
| pybedtools | 74.477086 | 74.868659 | 74.618066 | 0.38x |
| genomicranges | 623.865655 | 623.94955 | 623.896645 | 0.05x |
100-1p
overlap
nearest
count-overlaps
coverage
1000-1p
overlap
nearest
count-overlaps
coverage
10000-1p
overlap
nearest
count-overlaps
coverage
100000-1p
overlap
nearest
count-overlaps
coverage
1000000-1p
overlap
nearest
count-overlaps
coverage
10000000-1p
overlap
nearest
count-overlaps
coverage
Parallel performance
Results for parallel operations with 1, 2, 4, 6 and 8 threads.
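In the parallel tables, the Speedup column is consistent with dividing the single-thread polars_bio mean by the mean of each multi-threaded run, with the `-N` suffix in the Library column read here as the thread count (an assumption based on the sentence above). The sketch below reproduces that ratio and also derives parallel efficiency (speedup divided by thread count), a metric the source does not report; the function names are hypothetical.

```python
def parallel_speedup(single_thread_mean, mean):
    """Speedup of a multi-threaded run relative to the single-thread run."""
    return single_thread_mean / mean


def parallel_efficiency(single_thread_mean, mean, threads):
    """Fraction of ideal linear scaling achieved (1.0 would be perfect)."""
    return parallel_speedup(single_thread_mean, mean) / threads


# Mean runtimes (s) from the apple-m3-max, 8-7-8p, overlap table,
# keyed by thread count (assumed from the polars_bio-N labels).
means_by_threads = {1: 3.370889, 2: 1.811417, 4: 1.147355, 6: 0.962915, 8: 0.701048}

baseline = means_by_threads[1]
for threads, value in means_by_threads.items():
    s = parallel_speedup(baseline, value)
    e = parallel_efficiency(baseline, value, threads)
    print(f"{threads} threads: {s:.2f}x speedup, {e:.0%} efficiency")
```

Efficiency below 100% is expected for these workloads, and on the smaller datasets the tables show the speedup flattening or even dropping between 6 and 8 threads.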
apple-m3-max
8-7-8p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 3.247022 | 3.803021 | 3.370889 | 1.00x |
| polars_bio-2 | 1.798569 | 1.848162 | 1.811417 | 1.86x |
| polars_bio-4 | 1.140229 | 1.158243 | 1.147355 | 2.94x |
| polars_bio-6 | 0.959703 | 0.968725 | 0.962915 | 3.50x |
| polars_bio-8 | 0.694637 | 0.710492 | 0.701048 | 4.81x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 2.186354 | 2.248171 | 2.220822 | 1.00x |
| polars_bio-2 | 1.162969 | 1.222115 | 1.187505 | 1.87x |
| polars_bio-4 | 0.708508 | 0.735763 | 0.720115 | 3.08x |
| polars_bio-6 | 0.632877 | 0.652955 | 0.642816 | 3.45x |
| polars_bio-8 | 0.456674 | 0.476473 | 0.465284 | 4.77x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.502551 | 1.534006 | 1.515078 | 1.00x |
| polars_bio-2 | 0.811236 | 0.821365 | 0.815682 | 1.86x |
| polars_bio-4 | 0.440628 | 0.46778 | 0.455358 | 3.33x |
| polars_bio-6 | 0.331317 | 0.338207 | 0.334638 | 4.53x |
| polars_bio-8 | 0.280465 | 0.282707 | 0.281311 | 5.39x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.181806 | 1.185549 | 1.183889 | 1.00x |
| polars_bio-2 | 0.644288 | 0.645076 | 0.644587 | 1.84x |
| polars_bio-4 | 0.362752 | 0.363411 | 0.363036 | 3.26x |
| polars_bio-6 | 0.258583 | 0.272702 | 0.264111 | 4.48x |
| polars_bio-8 | 0.222888 | 0.234884 | 0.229052 | 5.17x |
1000000-8p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.468442 | 0.523065 | 0.494609 | 1.00x |
| polars_bio-2 | 0.262861 | 0.26828 | 0.265028 | 1.87x |
| polars_bio-4 | 0.1629 | 0.166657 | 0.164536 | 3.01x |
| polars_bio-6 | 0.137724 | 0.146893 | 0.143772 | 3.44x |
| polars_bio-8 | 0.111952 | 0.11465 | 0.113521 | 4.36x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.393067 | 0.415076 | 0.404032 | 1.00x |
| polars_bio-2 | 0.234559 | 0.235746 | 0.235051 | 1.72x |
| polars_bio-4 | 0.158996 | 0.167352 | 0.16349 | 2.47x |
| polars_bio-6 | 0.14634 | 0.14935 | 0.148215 | 2.73x |
| polars_bio-8 | 0.125472 | 0.128158 | 0.126606 | 3.19x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.267875 | 0.296727 | 0.277677 | 1.00x |
| polars_bio-2 | 0.163662 | 0.170045 | 0.165917 | 1.67x |
| polars_bio-4 | 0.111136 | 0.114835 | 0.112891 | 2.46x |
| polars_bio-6 | 0.097944 | 0.104607 | 0.101477 | 2.74x |
| polars_bio-8 | 0.099474 | 0.117493 | 0.106059 | 2.62x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 0.128377 | 0.131261 | 0.129598 | 1.00x |
| polars_bio-2 | 0.081762 | 0.085104 | 0.08324 | 1.56x |
| polars_bio-4 | 0.064151 | 0.066197 | 0.064851 | 2.00x |
| polars_bio-6 | 0.066926 | 0.06892 | 0.06768 | 1.91x |
| polars_bio-8 | 0.072767 | 0.074339 | 0.073589 | 1.76x |
10000000-8p
overlap

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 9.081732 | 9.388126 | 9.203018 | 1.00x |
| polars_bio-2 | 4.696455 | 4.912478 | 4.793254 | 1.92x |
| polars_bio-4 | 2.885023 | 2.902893 | 2.896218 | 3.18x |
| polars_bio-6 | 2.196605 | 2.217945 | 2.209839 | 4.16x |
| polars_bio-8 | 1.813586 | 1.860947 | 1.833498 | 5.02x |

nearest

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 7.299887 | 7.659385 | 7.495962 | 1.00x |
| polars_bio-2 | 4.01928 | 4.158504 | 4.069511 | 1.84x |
| polars_bio-4 | 2.683383 | 2.720981 | 2.704975 | 2.77x |
| polars_bio-6 | 2.141075 | 2.162109 | 2.150595 | 3.49x |
| polars_bio-8 | 1.859186 | 1.865634 | 1.862653 | 4.02x |

count-overlaps

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 5.30938 | 5.450332 | 5.381068 | 1.00x |
| polars_bio-2 | 2.893766 | 2.91378 | 2.906401 | 1.85x |
| polars_bio-4 | 1.748771 | 1.797485 | 1.768895 | 3.04x |
| polars_bio-6 | 1.352671 | 1.385655 | 1.369312 | 3.93x |
| polars_bio-8 | 1.178559 | 1.199971 | 1.192577 | 4.51x |

coverage

| Library | Min (s) | Max (s) | Mean (s) | Speedup |
| --- | --- | --- | --- | --- |
| polars_bio | 1.638818 | 1.678156 | 1.655573 | 1.00x |
| polars_bio-2 | 0.994195 | 0.996554 | 0.995701 | 1.66x |
| polars_bio-4 | 0.678722 | 0.701234 | 0.689151 | 2.40x |
| polars_bio-6 | 0.620289 | 0.662175 | 0.639026 | 2.59x |
| polars_bio-8 | 0.570659 | 0.582937 | 0.57688 | 2.87x |
gcp-linux
8-7-8p
overlap
nearest
count-overlaps
coverage
1000000-8p
overlap
nearest
count-overlaps
coverage
10000000-8p
overlap
nearest
count-overlaps
coverage
End to end tests
Results for an end-to-end test that computes overlap, nearest, coverage, and count-overlaps and saves the results to a CSV file.
Note
For pyranges0, we were unable to export the results of the coverage and count-overlaps operations to a CSV file, so those results are not presented here.
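The end-to-end tables report a Peak memory column and a polars_bio_streaming variant whose peak is often far lower than the non-streaming run. The source does not describe how streaming is implemented, so the sketch below only illustrates the general idea under a stated assumption: that a streaming export writes rows as they are produced instead of first materializing the full result. It uses the standard-library `tracemalloc` module as a stand-in for memory-profiler (it tracks Python allocations, not process RSS), and every function name in it is hypothetical.

```python
import csv
import os
import tempfile
import tracemalloc


def rows(n):
    """Yield n synthetic (chrom, start, end) interval rows one at a time."""
    for i in range(n):
        yield ("chr1", i * 10, i * 10 + 5)


def write_materialized(path, n):
    """Build the whole result in memory first, then write it out."""
    all_rows = list(rows(n))
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(all_rows)


def write_streaming(path, n):
    """Assumption: 'streaming' means rows are written as they are produced."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows(n))


def peak_bytes(func, *args):
    """Peak Python-level allocation (bytes) while func runs, via tracemalloc."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "result.csv")
    materialized_peak = peak_bytes(write_materialized, out, 50_000)
    streaming_peak = peak_bytes(write_streaming, out, 50_000)

print(f"materialized: {materialized_peak} B, streaming: {streaming_peak} B")
```

The absolute numbers depend on the interpreter and platform; only the ordering (streaming peak below materialized peak) is the point of the illustration.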
apple-m3-max
1-2
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.042378 | 0.130957 | 0.071929 | 3.10x | 285.468 |
| polars_bio_streaming | 0.035498 | 0.037438 | 0.036653 | 6.09x | 274.093 |
| bioframe | 0.208548 | 0.251457 | 0.223219 | 1.00x | 300.75 |
| pyranges0 | 0.409707 | 0.415361 | 0.412135 | 0.54x | 329.968 |
| pyranges1 | 0.47518 | 0.491508 | 0.482739 | 0.46x | 324.468 |

e2e-nearest-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.053349 | 0.058382 | 0.055362 | 9.14x | 321.062 |
| polars_bio_streaming | 0.051385 | 0.053979 | 0.052764 | 9.59x | 311.422 |
| bioframe | 0.503887 | 0.510257 | 0.506123 | 1.00x | 316.969 |
| pyranges0 | 1.135469 | 1.183369 | 1.151801 | 0.44x | 364.594 |
| pyranges1 | 1.327935 | 1.334101 | 1.331346 | 0.38x | 357.734 |

e2e-coverage-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.034756 | 0.038718 | 0.036421 | 13.40x | 290.078 |
| polars_bio_streaming | 0.03607 | 0.037332 | 0.036534 | 13.35x | 274.344 |
| bioframe | 0.48449 | 0.492328 | 0.487891 | 1.00x | 419.312 |
| pyranges1 | 0.971084 | 0.980085 | 0.975012 | 0.50x | 407.562 |

e2e-count-overlaps-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.03452 | 0.037714 | 0.035927 | 9.27x | 294.266 |
| polars_bio_streaming | 0.035863 | 0.036756 | 0.036414 | 9.14x | 278.438 |
| bioframe | 0.328145 | 0.338734 | 0.332951 | 1.00x | 306.234 |
| pyranges1 | 0.532739 | 0.544914 | 0.538646 | 0.62x | 328.328 |
8-7
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 22.781745 | 23.916568 | 23.161559 | 16.64x | 14677.0468 |
| polars_bio_streaming | 18.501279 | 18.797602 | 18.676707 | 20.63x | 555.109 |
| bioframe | 383.108514 | 387.500069 | 385.309331 | 1.00x | 33806.062 |
| pyranges0 | 276.421312 | 279.839508 | 277.845198 | 1.39x | 29777.312 |
| pyranges1 | 355.703878 | 367.680249 | 360.875151 | 1.07x | 34526.859 |

e2e-nearest-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 2.597955 | 2.760537 | 2.674482 | 32.02x | 1060.031 |
| polars_bio_streaming | 2.65088 | 2.685157 | 2.665171 | 32.13x | 560.453 |
| bioframe | 85.238305 | 86.131916 | 85.644961 | 1.00x | 6894.062 |
| pyranges0 | 13.530549 | 13.705834 | 13.620471 | 6.29x | 3031.797 |
| pyranges1 | 16.290782 | 16.385961 | 16.322671 | 5.25x | 3509.984 |

e2e-coverage-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 1.523833 | 1.555472 | 1.541038 | 21.41x | 717.984 |
| polars_bio_streaming | 1.336613 | 1.397324 | 1.364051 | 24.19x | 411.703 |
| bioframe | 32.294844 | 33.421618 | 32.99334 | 1.00x | 16651.922 |
| pyranges1 | 26.382409 | 27.382901 | 27.020202 | 1.22x | 6119.125 |

e2e-count-overlaps-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 1.806838 | 1.845584 | 1.82594 | 54.33x | 729.078 |
| polars_bio_streaming | 1.681187 | 1.767811 | 1.714943 | 57.85x | 416.094 |
| bioframe | 97.91802 | 101.736351 | 99.210461 | 1.00x | 23029.219 |
| pyranges1 | 19.498264 | 19.676838 | 19.561322 | 5.07x | 5270.234 |
100-1p
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.009118 | 0.077181 | 0.032054 | 1.55x | 248.594 |
| polars_bio_streaming | 0.003382 | 0.004769 | 0.003853 | 12.92x | 247.562 |
| bioframe | 0.030154 | 0.088667 | 0.049769 | 1.00x | 231.641 |
| pyranges0 | 0.045764 | 0.051035 | 0.047857 | 1.04x | 228.516 |
| pyranges1 | 0.053751 | 0.072545 | 0.060221 | 0.83x | 228.609 |

e2e-nearest-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.009145 | 0.038799 | 0.019201 | 2.24x | 253.156 |
| polars_bio_streaming | 0.003964 | 0.005051 | 0.004504 | 9.53x | 248.188 |
| bioframe | 0.033372 | 0.061107 | 0.042931 | 1.00x | 229.906 |
| pyranges0 | 0.049586 | 0.057381 | 0.052364 | 0.82x | 231.812 |
| pyranges1 | 0.054496 | 0.059205 | 0.056362 | 0.76x | 231.688 |

e2e-coverage-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.005492 | 0.020652 | 0.012584 | 5.20x | 245.578 |
| polars_bio_streaming | 0.003059 | 0.003746 | 0.003397 | 19.25x | 243.5 |
| bioframe | 0.060684 | 0.074157 | 0.065378 | 1.00x | 230.953 |
| pyranges1 | 0.093668 | 0.096265 | 0.094567 | 0.69x | 243.5 |

e2e-count-overlaps-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.005291 | 0.008843 | 0.006568 | 5.53x | 249.406 |
| polars_bio_streaming | 0.003279 | 0.003697 | 0.003447 | 10.53x | 245.672 |
| bioframe | 0.032914 | 0.042309 | 0.036302 | 1.00x | 234.141 |
| pyranges1 | 0.045085 | 0.045477 | 0.045224 | 0.80x | 232.703 |
10000000-1p
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 11.109423 | 11.871893 | 11.397992 | 10.38x | 7064.312 |
| polars_bio_streaming | 12.049206 | 12.327491 | 12.191582 | 9.71x | 1505.109 |
| bioframe | 117.701516 | 119.51073 | 118.356016 | 1.00x | 16380.234 |
| pyranges0 | 235.484308 | 243.216406 | 239.726101 | 0.49x | 14245.203 |
| pyranges1 | 109.722359 | 112.326873 | 111.23273 | 1.06x | 19423.172 |

e2e-nearest-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 7.842314 | 8.84828 | 8.510181 | 21.04x | 2301.0 |
| polars_bio_streaming | 7.589706 | 8.153016 | 7.842404 | 22.83x | 1327.531 |
| bioframe | 174.790383 | 183.458906 | 179.035999 | 1.00x | 10996.234 |
| pyranges0 | 32.793505 | 32.826686 | 32.809101 | 5.46x | 4882.656 |
| pyranges1 | 18.866156 | 19.570609 | 19.142653 | 9.35x | 5253.281 |

e2e-coverage-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 1.901833 | 1.957711 | 1.928367 | 12.15x | 956.844 |
| polars_bio_streaming | 1.797332 | 1.802527 | 1.800497 | 13.01x | 651.266 |
| bioframe | 23.269774 | 23.55838 | 23.430125 | 1.00x | 6493.234 |
| pyranges1 | 26.370249 | 27.172173 | 26.879266 | 0.87x | 10397.531 |

e2e-count-overlaps-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 5.025462 | 5.234103 | 5.129963 | 20.79x | 1036.734 |
| polars_bio_streaming | 4.956087 | 5.076052 | 5.014242 | 21.27x | 968.719 |
| bioframe | 105.322287 | 107.758078 | 106.64158 | 1.00x | 12803.828 |
| pyranges1 | 22.079391 | 23.069931 | 22.618209 | 4.71x | 10039.297 |
gcp-linux
1-2
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 0.072393 | 0.151871 | 0.09916 | 2.80x | 314.234 |
| polars_bio_streaming | 0.064092 | 0.067914 | 0.066202 | 4.19x | 288.621 |
| bioframe | 0.258278 | 0.31288 | 0.277225 | 1.00x | 287.101 |
| pyranges0 | 0.591745 | 0.599954 | 0.595204 | 0.47x | 307.218 |
| pyranges1 | 0.683388 | 0.702289 | 0.690362 | 0.40x | 327.863 |
e2e-nearest-csv
e2e-coverage-csv
e2e-count-overlaps-csv
8-7
e2e-overlap-csv

| Library | Min (s) | Max (s) | Mean (s) | Speedup | Peak memory (MB) |
| --- | --- | --- | --- | --- | --- |
| polars_bio | 44.539766 | 45.543038 | 45.196903 | 12.55x | 14575.14 |
| polars_bio_streaming | 34.007093 | 35.972075 | 35.309756 | 16.06x | 480.207 |
| bioframe | 566.167037 | 567.617695 | 567.13069 | 1.00x | 43295.378 |
| pyranges0 | 417.291061 | 421.875539 | 419.571591 | 1.35x | 22915.917 |
| pyranges1 | 538.365637 | 548.624613 | 543.918168 | 1.04x | 43408.699 |
e2e-nearest-csv
e2e-coverage-csv
e2e-count-overlaps-csv
100-1p
e2e-overlap-csv
e2e-nearest-csv
e2e-coverage-csv
e2e-count-overlaps-csv
10000000-1p
e2e-overlap-csv
e2e-nearest-csv
e2e-coverage-csv
e2e-count-overlaps-csv
Memory profiles
Operation: overlap for dataset: 1-2 on platform: apple-m3-max
Operation: nearest for dataset: 1-2 on platform: apple-m3-max
Operation: coverage for dataset: 1-2 on platform: apple-m3-max
Operation: count-overlaps for dataset: 1-2 on platform: apple-m3-max
Operation: overlap for dataset: 8-7 on platform: apple-m3-max
Operation: nearest for dataset: 8-7 on platform: apple-m3-max
Operation: coverage for dataset: 8-7 on platform: apple-m3-max
Operation: count-overlaps for dataset: 8-7 on platform: apple-m3-max
Operation: overlap for dataset: 100-1p on platform: apple-m3-max
Operation: nearest for dataset: 100-1p on platform: apple-m3-max
Operation: coverage for dataset: 100-1p on platform: apple-m3-max
Operation: count-overlaps for dataset: 100-1p on platform: apple-m3-max
Operation: overlap for dataset: 10000000-1p on platform: apple-m3-max
Operation: nearest for dataset: 10000000-1p on platform: apple-m3-max
Operation: coverage for dataset: 10000000-1p on platform: apple-m3-max
Operation: count-overlaps for dataset: 10000000-1p on platform: apple-m3-max
Operation: overlap for dataset: 1-2 on platform: gcp-linux
Operation: nearest for dataset: 1-2 on platform: gcp-linux
Operation: coverage for dataset: 1-2 on platform: gcp-linux
Operation: count-overlaps for dataset: 1-2 on platform: gcp-linux
Operation: overlap for dataset: 8-7 on platform: gcp-linux
Operation: nearest for dataset: 8-7 on platform: gcp-linux
Operation: coverage for dataset: 8-7 on platform: gcp-linux
Operation: count-overlaps for dataset: 8-7 on platform: gcp-linux
Operation: overlap for dataset: 100-1p on platform: gcp-linux
Operation: nearest for dataset: 100-1p on platform: gcp-linux
Operation: coverage for dataset: 100-1p on platform: gcp-linux
Operation: count-overlaps for dataset: 100-1p on platform: gcp-linux
Operation: overlap for dataset: 10000000-1p on platform: gcp-linux
Operation: nearest for dataset: 10000000-1p on platform: gcp-linux
Operation: coverage for dataset: 10000000-1p on platform: gcp-linux
Operation: count-overlaps for dataset: 10000000-1p on platform: gcp-linux
Comparison of the output schemas and data types
polars-bio tries to preserve the output schema of the bioframe package, whereas pyranges uses its own internal representation, which can be converted to a Pandas DataFrame. pyranges also always uses int64 to represent start/end positions, while polars-bio and bioframe choose the type adaptively, based on the data types of the input files or DataFrames. polars-bio does not support interval operations on chromosomes longer than 2 Gbp (issue). However, in the analyzed test case (8-7), the input and output data structures have similar memory requirements.
Please compare the following schema and memory size estimates of the input and output DataFrames for the 8-7 test case:
Please note that pyranges, unlike bioframe and polars-bio, returns only one chromosome column but uses int64 to encode start and end positions even if the input datasets use int32.
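To make the int32 versus int64 trade-off concrete, the sketch below estimates the raw storage needed for the start and end columns at each width and checks whether a coordinate fits in a signed 32-bit integer. It is a back-of-the-envelope illustration, not code from any of the benchmarked libraries: the helper names are hypothetical, and reading the 2 Gbp limit as the signed 32-bit maximum (2^31 - 1) is an assumption rather than something the source states.

```python
import struct

# Assumption: the "2 Gbp" limit corresponds to the signed 32-bit maximum.
INT32_MAX = 2**31 - 1


def coordinate_bytes(n_intervals, fmt):
    """Raw bytes for the start and end columns of n_intervals rows.

    fmt is a struct format: "<i" for int32, "<q" for int64.
    """
    return 2 * n_intervals * struct.calcsize(fmt)


def fits_int32(position):
    """True if a non-negative coordinate is representable as int32."""
    return 0 <= position <= INT32_MAX


n = 1_000_000
print(coordinate_bytes(n, "<i"))  # 8000000 bytes with int32 start/end
print(coordinate_bytes(n, "<q"))  # 16000000 bytes with int64 start/end
print(fits_int32(2_147_483_648))  # False: one past the int32 maximum
```

The factor of two applies only to the coordinate columns; chromosome names and any other columns are unaffected, which is one reason whole-DataFrame memory can still end up similar across libraries.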