One of the main side effects of my recent work on Printrun is that we now store the parsed G-code instead of the plain text one. I wrote and tested these changes on machines which were not memory-tight (with either 8 or 16GB of RAM) and did not initially foresee that I was creating a huge memory hog. Indeed, instead of just storing the G-code commands as a list of strings, we now store a list of Python objects with various attributes, the raw G-code string being just one of them. Loading a 3.8MB file with about 140000 G-code lines led to about 244MB of memory usage just for the parsed G-code, which is definitely not right (especially since some users have very large G-code files, more than 20 times bigger than the one I'm using).
Let's look at how we can find out where we are using memory and what we can do to reduce this usage.
Understanding the data structures
We basically have a large batch of Line objects. Each Line object is a storage hub for a single G-code command and the attributes parsed from it: the command (as a string), the X/Y/Z/E/F G-code arguments, the absolute X/Y/Z position, and a few pieces of metadata, such as whether the coordinates are imperial, whether the move is relative, or whether the command involves the extruder.
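For instance, parsing a single move command fills in a handful of these attributes. This is an illustration only; the attribute names follow the description above:

# illustration: what we parse out of a single move command
line = Line("G1 X10.0 Y20.0 E1.5 F3000")
line.command    # "G1"
line.x, line.y  # 10.0, 20.0
line.e, line.f  # 1.5, 3000.0
line.relative   # False: absolute positioning was active
line.imperial   # False: metric coordinates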
These lines are accessed through three main paths: a list which references all the lines, a dictionary of Layer objects referencing all the G-codes started from a given Z (these Z values are the layer indices), and a list of Layer objects with one Layer object per Z change (so a head lift during a move creates two new layers, the following commands going into the last created layer).
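In code, the three access paths look roughly like this (a sketch; the actual attribute names in gcoder may differ):

class GCode(object):
    def __init__(self):
        self.lines = []       # every Line object, in file order
        self.layers = {}      # Z value -> Layer of the G-codes run at that Z
        self.all_layers = []  # one Layer appended per Z change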
Profiling memory usage of Python code
I used two main approaches to profile the code. The first is simply the sys.getsizeof function from the Python standard library, which gives you the size of the object you pass it (not including the size of the objects it references, but including the garbage collector metadata overhead). The second, which is actually the one I used the most (I only used the first at a very late stage), is the memory inspector Heapy.
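As a quick reminder of what sys.getsizeof does and does not count:

import sys

lst = [1, 2, 3]
print sys.getsizeof(lst)     # size of the list object itself...
print sys.getsizeof(lst[0])  # ...but not of the int objects it references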
Heapy is very easy to use: just install guppy (the programming environment that provides Heapy), and then:
from guppy import hpy
h = hpy()
h.setref()
# the code you want to memory-profile
print h.heap()
This should print something like the following report, which is the starting point of our adventure (based on commit de02427):
Partition of a set of 2030760 objects. Total size = 256042336 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536   7 146214528  57  146214528  57 dict of printrun.gcoder.Line
     1 838522  41  43666288  17  189880816  74 str
     2 140075   7  23991064   9  213871880  84 list
     3 279047  14  21207472   8  235079352  92 tuple
     4 419904  21  10077696   4  245157048  96 float
     5 139536   7   8930304   3  254087352  99 printrun.gcoder.Line
     6  73040   4   1752960   1  255840312 100 int
     7    536   0    150080   0  255990392 100 dict of printrun.gcoder.Layer
     8    536   0     34304   0  256024696 100 printrun.gcoder.Layer
     9      4   0     13408   0  256038104 100 dict (no owner)
What we can see here is that 57% of the memory is used for the dictionaries that hold the Line objects' attributes, 17% for strings, 9% for lists, 8% for tuples, 4% for floats, 3% for the actual Line objects and 1% for integers.
Getting rid of temporary data
My first shot (taken even before using Heapy, so it might sound a little silly) was to drop data which had a limited lifetime. We were storing the result of splitting the G-code string with a complex regular expression until the point where all the information from this split had been consumed to parse the command's attributes. Thus in 86bf8ca we started dropping this intermediate result (which was basically a tuple holding a bunch of strings) after using it, which reduced memory consumption by about 14%, down to 200MB:
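The fix boils down to keeping the split result as a local variable instead of an attribute, so it can be freed as soon as parsing is done. A simplified sketch, with hypothetical names (gcode_exp, _parse_coords) standing in for the actual Printrun code:

def parse(self, raw):
    split_raw = gcode_exp.findall(raw)  # temporary regexp split result
    self._parse_coords(split_raw)
    # split_raw goes out of scope here and can be freed immediately,
    # instead of living as long as the Line object itself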
Partition of a set of 1331825 objects. Total size = 210155496 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  10 146214528  70  146214528  70 dict of printrun.gcoder.Line
     1 279047  21  21207472  10  167422000  80 tuple
     2 279103  21  16753624   8  184175624  88 str
     3 419904  32  10077696   5  194253320  92 float
     4 139536  10   8930304   4  203183624  97 printrun.gcoder.Line
     5    559   0   5016888   2  208200512  99 list
     6  73040   5   1752960   1  209953472 100 int
     7    536   0    150080   0  210103552 100 dict of printrun.gcoder.Layer
     8    536   0     34304   0  210137856 100 printrun.gcoder.Layer
     9      4   0     13408   0  210151264 100 dict (no owner)
The __slots__ magic
From there, since I was not very sure where to cut costs, I started using Heapy and found that most of our memory (70%) was being used by the dynamic dictionaries that store Line objects' attributes. This is an expected downside of Python's flexibility: since you can add new attributes to an object on the fly at any time, Python cannot just allocate a static amount of memory at object creation to store all the attributes. This is where the __slots__ class variable comes to the rescue: it lets you declare up front which attributes will ever be used on the object. We implemented this trick in 2a6eec7, further reducing memory consumption on our test file to 77MB, which is 62% less memory than before.
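A minimal sketch of the trick (the real gcoder.Line declares more attributes than this):

class Line(object):
    __slots__ = ('command', 'raw', 'x', 'y', 'z', 'e', 'f',
                 'imperial', 'relative', 'extruding')

    def __init__(self, raw):
        self.raw = raw

Instances now use a fixed-size layout for exactly these attributes and no longer carry a per-instance __dict__; the trade-off is that assigning any attribute not listed in __slots__ raises an AttributeError.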
Partition of a set of 1192289 objects. Total size = 80685288 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  12  25674624  32   25674624  32 printrun.gcoder.Line
     1 279047  23  21207472  26   46882096  58 tuple
     2 279103  23  16753624  21   63635720  79 str
     3 419904  35  10077696  12   73713416  91 float
     4    559   0   5016888   6   78730304  98 list
     5  73040   6   1752960   2   80483264 100 int
     6    536   0    150080   0   80633344 100 dict of printrun.gcoder.Layer
     7    536   0     34304   0   80667648 100 printrun.gcoder.Layer
     8      4   0     13408   0   80681056 100 dict (no owner)
     9      4   0      1120   0   80682176 100 dict of guppy.etc.Glue.Interface
Replacing lists of tuples with arrays
At this point the memory usage of Line objects felt pretty much right: still too high, but back to a reasonable ratio compared to the other data structures. However, the cost of tuple objects now looked a little too high. We used tuples in two places: to store the current absolute X/Y/Z position at the end of each G-code line, and in a big list mapping each line number to its layer number and its position in that layer's line list. The latter can easily be replaced by two lists holding the same information without the tuples, or better, by two Python arrays, which are very compact storage arrays for standard types (characters/integers/floating point numbers) and thus have much less overhead than a list holding integer objects. Switching to integer arrays in 5e1a0be brought us down to 65.5MB, a further 15% improvement.
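A sketch of the replacement, assuming unsigned integer indices (typecode 'I'):

from array import array

layer_idxs = array('I')  # layer number for each G-code line
line_idxs = array('I')   # position of the line within that layer

for layer_no, pos in [(0, 0), (0, 1), (1, 0)]:
    layer_idxs.append(layer_no)  # stored as raw C unsigned ints,
    line_idxs.append(pos)        # not as full Python int objects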
Removing more tuples
As explained in the previous section, we also had a tuple storing the destination absolute position of each G-code line. We simply split it into three separate properties in b27cb37, reducing memory consumption to 57MB, a further 13% improvement.
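Conceptually (attribute names illustrative):

# before: one 3-element tuple allocated per line
self.current_pos = (x, y, z)

# after: three scalar attributes declared in __slots__
self.current_x = x
self.current_y = y
self.current_z = z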
Going C-style
So, here we are:
Partition of a set of 840204 objects. Total size = 59763064 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  17  27907200  47   27907200  47 printrun.gcoder.Line
     1 279103  33  16753624  28   44660824  75 str
     2 419904  50  10077696  17   54738520  92 float
     3    558   0   3706096   6   58444616  98 list
     4      2   0   1116400   2   59561016 100 array.array
     5    536   0    150080   0   59711096 100 dict of printrun.gcoder.Layer
     6    536   0     34304   0   59745400 100 printrun.gcoder.Layer
     7      4   0     13408   0   59758808 100 dict (no owner)
     8      4   0      1120   0   59759928 100 dict of guppy.etc.Glue.Interface
     9      1   0      1048   0   59760976 100 dict of printrun.gcoder.GCode
47% of the memory is used by the Line objects, 28% by string objects and 17% by float objects. The strings take 16MB of memory while we only have 3.8MB of input data, a 4x factor, which is probably due to the 40 bytes of per-object overhead that sys.getsizeof("") reports for str objects. And each float takes 24 bytes, while a double should take no more than 8 bytes.
The solution we used to get rid of all this overhead was to provide an extension module written in C, using Cython, which stores all the properties in a C structure. However, we had to take care of two issues. The first was the need for a fallback mode if the Cython module isn't available, so we had to keep the previous Python-based Line implementation and derive from either this PyLine or the Cython-based gcoder_line.Line object. The second was that Python properties could be either None or some value of the expected type, and we used this behavior to detect whether an argument was present in the G-code line (that is, if after parsing a command the line.x attribute is None, then there was no X argument in the command). The solution to this second requirement is to store, for each property, an extra boolean telling whether the property was ever set, so that the property access can return None if no value was set.
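The fallback logic is conceptually simple; a sketch, with module and class names as described above:

try:
    from printrun import gcoder_line
    LineBase = gcoder_line.Line  # fast, compact Cython implementation
except ImportError:
    LineBase = PyLine            # pure-Python implementation kept as fallback

class Line(LineBase):
    __slots__ = ()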
Let's talk about implementation now. I first tried to implement argument access in a Pythonic way, hacking __setattr__ and __getattr__ to do the extra None-checking, as can be seen in a3b27af. This solution was much lighter, down to 26MB, a further 54% improvement (though we lost track of some memory here, see below), but it was also about 3 times slower than the original Python code:
Partition of a set of 141504 objects. Total size = 27358912 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  99  22325760  82   22325760  82 printrun.gcoder.Line
 [...]
The solution to this slowdown, implemented in a511f20, was simply to use Cython's properties mechanism, which lets you write getters and setters for a given attribute that get compiled into very fast C code, while using the same amount of memory per instance.
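For each attribute, the Cython class stores the raw C value plus a "was it set" flag, and exposes it through a property. A sketch using the Cython property-block syntax of that era; field names are illustrative:

cdef class Line:
    cdef float _x
    cdef bint _x_set

    property x:
        def __get__(self):
            # mimic the Python behavior: None means "no X argument"
            return self._x if self._x_set else None
        def __set__(self, value):
            self._x = value
            self._x_set = True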
The next improvement was to compress all the booleans specifying whether a given argument had been set into a single bit array stored as a 4-byte (32-bit) integer, which reduced the per-instance memory usage from 160 bytes to 144 (1c41072). We extended this trick to all the boolean properties in fc43d55, further reducing the per-instance cost to 136 bytes.
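Conceptually, the bit array replaces the per-attribute flags with bit tests on a single integer (a sketch; the bit assignments are invented for illustration):

cdef class Line:
    cdef unsigned int _set_flags  # one presence bit per attribute
    cdef float _x

    property x:
        def __get__(self):
            return self._x if (self._set_flags & 1) else None
        def __set__(self, value):
            self._x = value
            self._set_flags |= 1  # bit 0 marks "x was set"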
Getting rid of the cyclic garbage collector
Though memory consumption had already been trimmed down a lot, I was a little confused: the Line objects were still taking 136 bytes, while the properties I had declared should have taken about 97 bytes, and a manual sizeof on the compiled C struct reported 104 bytes (the 7-byte difference is just a matter of memory padding). After looking around a lot and applying the standard "remove one line of code at a time until things start making sense" technique, I figured out that the 32 extra bytes came from the fact that one of my properties was a Python object (the tuple of strings from the G-code splitting regexp), and that referencing a Python object triggered the addition of cyclic garbage collection metadata. Fine: I refactored the code to not store this information at all. The Cython-based Line objects then started being the right size, but the final Line objects (the ones which inherited from either PyLine or the Cython-based gcoder_line.Line) were still 32 bytes too large, even though they declared no extra __slots__ compared to their ancestors (so no property other than the parent's could reference other Python objects, and as the Cython parent could not, the child class should not have been able to either).
After investigating a lot more, I figured out that while the same derivation worked fine with new-style Python objects (the ones which derive from the object class), Cython objects are a little different, and deriving from them always pulls in this GC metadata. I thus rewrote the code to not require the derivation (by moving the previous class methods to module-level functions), and after 77888d9 we had memory consumption down to 17.6MB on our test file.
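sys.getsizeof makes this kind of overhead easy to spot, since (as noted earlier) it includes the cyclic GC header when present. A hypothetical check, assuming GLine can be instantiated without arguments:

import sys
from printrun import gcoder_line

print sys.getsizeof(gcoder_line.GLine())  # plain Cython object, no GC header

class Derived(gcoder_line.GLine):
    __slots__ = ()

print sys.getsizeof(Derived())  # before 77888d9 this was 32 bytes larger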
After a few more tricks and some fine-tuning of the properties list (we got rid of rarely used properties), the size of Line objects was reduced to 80 bytes, and the global memory consumption went down to 15.3MB:
Partition of a set of 140384 objects. Total size = 16011448 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  99  11162880  70   11162880  70 printrun.gcoder_line.GLine
     1    536   0   2394240  15   13557120  85 printrun.gcoder.Layer
     2      2   0   1313424   8   14870544  93 list
     3      2   0   1116400   7   15986944 100 array.array
     4      4   0     13408   0   16000352 100 dict (no owner)
     5    275   0      6600   0   16006952 100 float
     6      4   0      1120   0   16008072 100 dict of guppy.etc.Glue.Interface
     7      1   0      1048   0   16009120 100 dict of printrun.gcoder.GCode
     8      8   0       704   0   16009824 100 __builtin__.weakref
     9      1   0       448   0   16010272 100 types.FrameType
Adding Heapy support for Cython objects
However, the numbers I announced in the previous section are wrong. One big issue with using Cython is that Heapy can't automagically follow C pointers, and as we store the raw G-code string and the G-code command (the "G1" or "M80" part, for instance) as C strings, Heapy lost track of them: we were actually using more memory than Heapy claimed.
Luckily, Heapy provides a mechanism to register a function which takes an object of a given type and returns the space used by the object and its properties, as documented on http://www.smira.ru/wp-content/uploads/2011/08/heapy.html
The only trick is that the C code is only a by-product of the Cython extension build process, and I couldn't add raw C code to the Cython source, so I had to write a patch to be applied to the C source after it has been generated (the patch is available in 040b89f).
After this patch we have a correct estimate of the memory usage: we effectively reduced memory consumption from 244MB down to 19.4MB, a reduction factor of 12.5. We still use about 5 times more memory than the size of the file we load, but given how much metadata we store, that feels quite decent. I don't expect anyone to load a G-code file so big that 5 times its size would exceed their system's memory limits.
Partition of a set of 140383 objects. Total size = 20369977 bytes.
 Index  Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0 139536  99  15521497  76   15521497  76 printrun.gcoder_line.GLine
     1    536   0   2394240  12   17915737  88 printrun.gcoder.Layer
     2      2   0   1313424   6   19229161  94 list
     3      2   0   1116400   5   20345561 100 array.array
     4      4   0     13408   0   20358969 100 dict (no owner)
     5    275   0      6600   0   20365569 100 float
     6      4   0      1120   0   20366689 100 dict of guppy.etc.Glue.Interface
     7      1   0      1048   0   20367737 100 dict of printrun.gcoder.GCode
     8      7   0       616   0   20368353 100 __builtin__.weakref
     9      1   0       448   0   20368801 100 types.FrameType
So, long story short: Heapy is a very cool tool for profiling the memory usage of Python code, Cython can help reduce your memory usage by letting you write extension modules very easily, and you should not forget about Python's cyclic garbage collector when investigating your objects' memory usage.