Commit 4a511bf

Forward-port generational GC.
Co-Authored-By: Neil Schemenauer <nas@arctrix.com>
Co-Authored-By: Zanie Blue <contact@zanie.dev>
Co-Authored-By: Sergey Miryanov <sergey.miryanov@gmail.com>
1 parent 7ce737e commit 4a511bf

14 files changed: 541 additions, 1168 deletions


Doc/library/gc.rst

Lines changed: 15 additions & 38 deletions
@@ -37,18 +37,11 @@ The :mod:`!gc` module provides the following functions:

  .. function:: collect(generation=2)

- Perform a collection. The optional argument *generation*
+ With no arguments, run a full collection. The optional argument *generation*
  may be an integer specifying which generation to collect (from 0 to 2). A
  :exc:`ValueError` is raised if the generation number is invalid. The sum of
  collected objects and uncollectable objects is returned.

- Calling ``gc.collect(0)`` will perform a GC collection on the young generation.
-
- Calling ``gc.collect(1)`` will perform a GC collection on the young generation
- and an increment of the old generation.
-
- Calling ``gc.collect(2)`` or ``gc.collect()`` performs a full collection
-
  The free lists maintained for a number of built-in types are cleared
  whenever a full collection or collection of the highest generation (2)
  is run. Not all items in some free lists may be freed due to the
@@ -57,9 +50,6 @@ The :mod:`!gc` module provides the following functions:
  The effect of calling ``gc.collect()`` while the interpreter is already
  performing a collection is undefined.

- .. versionchanged:: 3.14
- ``generation=1`` performs an increment of collection.
-

  .. function:: set_debug(flags)

@@ -75,20 +65,13 @@ The :mod:`!gc` module provides the following functions:

  .. function:: get_objects(generation=None)

-
  Returns a list of all objects tracked by the collector, excluding the list
- returned. If *generation* is not ``None``, return only the objects as follows:
-
- * 0: All objects in the young generation
- * 1: No objects, as there is no generation 1 (as of Python 3.14)
- * 2: All objects in the old generation
+ returned. If *generation* is not ``None``, return only the objects tracked by
+ the collector that are in that generation.

  .. versionchanged:: 3.8
  New *generation* parameter.

- .. versionchanged:: 3.14
- Generation 1 is removed
-
  .. audit-event:: gc.get_objects generation gc.get_objects

  .. function:: get_stats()
@@ -124,32 +107,26 @@ The :mod:`!gc` module provides the following functions:
  Set the garbage collection thresholds (the collection frequency). Setting
  *threshold0* to zero disables collection.

- The GC classifies objects into two generations depending on whether they have
- survived a collection. New objects are placed in the young generation. If an
- object survives a collection it is moved into the old generation.
-
- In order to decide when to run, the collector keeps track of the number of object
+ The GC classifies objects into three generations depending on how many
+ collection sweeps they have survived. New objects are placed in the youngest
+ generation (generation ``0``). If an object survives a collection it is moved
+ into the next older generation. Since generation ``2`` is the oldest
+ generation, objects in that generation remain there after a collection. In
+ order to decide when to run, the collector keeps track of the number object
  allocations and deallocations since the last collection. When the number of
  allocations minus the number of deallocations exceeds *threshold0*, collection
- starts. For each collection, all the objects in the young generation and some
- fraction of the old generation is collected.
+ starts. Initially only generation ``0`` is examined. If generation ``0`` has
+ been examined more than *threshold1* times since generation ``1`` has been
+ examined, then generation ``1`` is examined as well.
+ With the third generation, things are a bit more complicated,
+ see `Collecting the oldest generation <https://github.com/python/cpython/blob/ff0ef0a54bef26fc507fbf9b7a6009eb7d3f17f5/InternalDocs/garbage_collector.md#collecting-the-oldest-generation>`_ for more information.

  In the free-threaded build, the increase in process memory usage is also
  checked before running the collector. If the memory usage has not increased
  by 10% since the last collection and the net number of object allocations
  has not exceeded 40 times *threshold0*, the collection is not run.

- The fraction of the old generation that is collected is **inversely** proportional
- to *threshold1*. The larger *threshold1* is, the slower objects in the old generation
- are collected.
- For the default value of 10, 1% of the old generation is scanned during each collection.
-
- *threshold2* is ignored.
-
- See `Garbage collector design <https://devguide.python.org/garbage_collector>`_ for more information.
-
- .. versionchanged:: 3.14
- *threshold2* is ignored
+ See `Garbage collector design <https://github.com/python/cpython/blob/3.15/InternalDocs/garbage_collector.md>`_ for more information.


  .. function:: get_count()
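
As a rough illustration of the `gc` API whose documentation is touched above, the following sketch exercises `gc.collect()`, `gc.get_threshold()` and `gc.set_threshold()`. The `Node` class and the helper names are hypothetical and not part of this commit. The exact number returned by `gc.collect()` and the reported value of *threshold2* can differ between Python versions and builds, so the sketch only relies on a lower bound and on the first two thresholds.

```python
import gc


class Node:
    """Hypothetical container that refers to itself, forming a reference cycle."""

    def __init__(self):
        self.me = self


def collect_fresh_cycles(n, generation=0):
    """Create n unreachable cycles, then collect the given generation.

    Returns the number of unreachable objects that gc.collect() reports.
    """
    was_enabled = gc.isenabled()
    gc.disable()              # keep automatic collections out of the way
    try:
        gc.collect()          # start from a clean state
        for _ in range(n):
            Node()            # unreachable at once, but kept alive by its cycle
        return gc.collect(generation)
    finally:
        if was_enabled:
            gc.enable()


def with_thresholds(threshold0, threshold1, threshold2):
    """Temporarily set the thresholds and return what get_threshold() reports."""
    saved = gc.get_threshold()
    gc.set_threshold(threshold0, threshold1, threshold2)
    try:
        return gc.get_threshold()
    finally:
        gc.set_threshold(*saved)


def invalid_generation_raises():
    """gc.collect() rejects generation numbers outside 0..2."""
    try:
        gc.collect(3)
    except ValueError:
        return True
    return False
```

Newly created objects start in generation 0, so `collect_fresh_cycles(50)` should report at least 50 objects even when only the youngest generation is collected.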

Include/internal/pycore_debug_offsets.h

Lines changed: 0 additions & 4 deletions
@@ -222,8 +222,6 @@ typedef struct _Py_DebugOffsets {
  uint64_t size;
  uint64_t collecting;
  uint64_t frame;
- uint64_t generation_stats_size;
- uint64_t generation_stats;
  } gc;

  // Generator object offset;
@@ -375,8 +373,6 @@ typedef struct _Py_DebugOffsets {
  .size = sizeof(struct _gc_runtime_state), \
  .collecting = offsetof(struct _gc_runtime_state, collecting), \
  .frame = offsetof(struct _gc_runtime_state, frame), \
- .generation_stats_size = sizeof(struct gc_stats), \
- .generation_stats = offsetof(struct _gc_runtime_state, generation_stats), \
  }, \
  .gen_object = { \
  .size = sizeof(PyGenObject), \

Include/internal/pycore_gc.h

Lines changed: 5 additions & 24 deletions
@@ -131,7 +131,6 @@ static inline void _PyObject_GC_SET_SHARED(PyObject *op) {
  * When object are moved from the pending space, old[gcstate->visited_space^1]
  * into the increment, the old space bit is flipped.
  */
- #define _PyGC_NEXT_MASK_OLD_SPACE_1 1

  #define _PyGC_PREV_SHIFT 2
  #define _PyGC_PREV_MASK (((uintptr_t) -1) << _PyGC_PREV_SHIFT)
@@ -159,13 +158,11 @@ typedef enum {
  // Lowest bit of _gc_next is used for flags only in GC.
  // But it is always 0 for normal code.
  static inline PyGC_Head* _PyGCHead_NEXT(PyGC_Head *gc) {
- uintptr_t next = gc->_gc_next & _PyGC_PREV_MASK;
+ uintptr_t next = gc->_gc_next;
  return (PyGC_Head*)next;
  }
  static inline void _PyGCHead_SET_NEXT(PyGC_Head *gc, PyGC_Head *next) {
- uintptr_t unext = (uintptr_t)next;
- assert((unext & ~_PyGC_PREV_MASK) == 0);
- gc->_gc_next = (gc->_gc_next & ~_PyGC_PREV_MASK) | unext;
+ gc->_gc_next = (uintptr_t)next;
  }

  // Lowest two bits of _gc_prev is used for _PyGC_PREV_MASK_* flags.
@@ -207,10 +204,6 @@ static inline void _PyGC_CLEAR_FINALIZED(PyObject *op) {

  extern void _Py_ScheduleGC(PyThreadState *tstate);

- #ifndef Py_GIL_DISABLED
- extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
- #endif
-

  /* Tell the GC to track this object.
  *
@@ -220,7 +213,7 @@ extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
  * ob_traverse method.
  *
  * Internal note: interp->gc.generation0->_gc_prev doesn't have any bit flags
- * because it's not object header. So we don't use _PyGCHead_PREV() and
+ * because it's not an object header. So we don't use _PyGCHead_PREV() and
  * _PyGCHead_SET_PREV() for it to avoid unnecessary bitwise operations.
  *
  * See also the public PyObject_GC_Track() function.
@@ -244,19 +237,12 @@ static inline void _PyObject_GC_TRACK(
  "object is in generation which is garbage collected",
  filename, lineno, __func__);

- struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
- PyGC_Head *generation0 = &gcstate->young.head;
+ PyGC_Head *generation0 = _PyInterpreterState_GET()->gc.generation0;
  PyGC_Head *last = (PyGC_Head*)(generation0->_gc_prev);
  _PyGCHead_SET_NEXT(last, gc);
  _PyGCHead_SET_PREV(gc, last);
- uintptr_t not_visited = 1 ^ gcstate->visited_space;
- gc->_gc_next = ((uintptr_t)generation0) | not_visited;
+ _PyGCHead_SET_NEXT(gc, generation0);
  generation0->_gc_prev = (uintptr_t)gc;
- gcstate->young.count++; /* number of tracked GC objects */
- gcstate->heap_size++;
- if (gcstate->young.count > gcstate->young.threshold) {
- _Py_TriggerGC(gcstate);
- }
  #endif
  }

@@ -291,11 +277,6 @@ static inline void _PyObject_GC_UNTRACK(
  _PyGCHead_SET_PREV(next, prev);
  gc->_gc_next = 0;
  gc->_gc_prev &= _PyGC_PREV_MASK_FINALIZED;
- struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
- if (gcstate->young.count > 0) {
- gcstate->young.count--;
- }
- gcstate->heap_size--;
  #endif
  }
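
The list manipulation that remains in `_PyObject_GC_TRACK()` and `_PyObject_GC_UNTRACK()` after this change amounts to a tail append and an unlink on a circular doubly linked list. Below is a minimal Python model of that idea, assuming plain object references in place of the tagged `uintptr_t` fields. The `Head` class and the function names are illustrative only and are not CPython API.

```python
class Head:
    """Stand-in for PyGC_Head: one node of a circular doubly linked list."""

    def __init__(self, name):
        self.name = name
        self.next = self   # a lone node points at itself, like an empty list head
        self.prev = self


def track(generation0, gc):
    """Append gc at the tail of the list headed by generation0."""
    last = generation0.prev
    last.next = gc
    gc.prev = last
    gc.next = generation0
    generation0.prev = gc


def untrack(gc):
    """Unlink gc from whichever list it is currently on."""
    prev, nxt = gc.prev, gc.next
    prev.next = nxt
    nxt.prev = prev
    gc.next = None   # the C code stores 0 in _gc_next to mark "untracked"
    gc.prev = None   # simplification: the C code keeps the FINALIZED flag bit


def names(generation0):
    """Walk the list from the head and return the node names in order."""
    out = []
    node = generation0.next
    while node is not generation0:
        out.append(node.name)
        node = node.next
    return out
```

Because the head is itself a node, neither operation needs a special case for an empty list.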

Include/internal/pycore_interp_structs.h

Lines changed: 36 additions & 49 deletions
@@ -177,78 +177,55 @@ struct gc_generation {
  generations */
  };

+ struct gc_collection_stats {
+ /* number of collected objects */
+ Py_ssize_t collected;
+ /* total number of uncollectable objects (put into gc.garbage) */
+ Py_ssize_t uncollectable;
+ // Total number of objects considered for collection and traversed:
+ Py_ssize_t candidates;
+ // Duration of the collection in seconds:
+ double duration;
+ };
+
  /* Running stats per generation */
  struct gc_generation_stats {
- PyTime_t ts_start;
- PyTime_t ts_stop;
-
- /* heap_size on the start of the collection */
- Py_ssize_t heap_size;
-
- /* work_to_do on the start of the collection */
- Py_ssize_t work_to_do;
-
  /* total number of collections */
  Py_ssize_t collections;
-
- /* total number of visited objects */
- Py_ssize_t object_visits;
-
  /* total number of collected objects */
  Py_ssize_t collected;
  /* total number of uncollectable objects (put into gc.garbage) */
  Py_ssize_t uncollectable;
  // Total number of objects considered for collection and traversed:
  Py_ssize_t candidates;
-
- Py_ssize_t objects_transitively_reachable;
- Py_ssize_t objects_not_transitively_reachable;
-
- // Total duration of the collection in seconds:
+ // Duration of the collection in seconds:
  double duration;
  };

- #ifdef Py_GIL_DISABLED
- #define GC_YOUNG_STATS_SIZE 1
- #define GC_OLD_STATS_SIZE 1
- #else
- #define GC_YOUNG_STATS_SIZE 11
- #define GC_OLD_STATS_SIZE 3
- #endif
- struct gc_young_stats_buffer {
- struct gc_generation_stats items[GC_YOUNG_STATS_SIZE];
- int8_t index;
- };
-
- struct gc_old_stats_buffer {
- struct gc_generation_stats items[GC_OLD_STATS_SIZE];
- int8_t index;
- };
-
  enum _GCPhase {
  GC_PHASE_MARK = 0,
  GC_PHASE_COLLECT = 1
  };

  /* If we change this, we need to change the default value in the
- signature of gc.collect and change the size of PyStats.gc_stats */
+ signature of gc.collect */
  #define NUM_GENERATIONS 3

- struct gc_stats {
- struct gc_young_stats_buffer young;
- struct gc_old_stats_buffer old[2];
- };
-
  struct _gc_runtime_state {
  /* Is automatic collection enabled? */
  int enabled;
  int debug;
  /* linked lists of container objects */
+ #ifndef Py_GIL_DISABLED
+ struct gc_generation generations[NUM_GENERATIONS];
+ PyGC_Head *generation0;
+ #else
  struct gc_generation young;
  struct gc_generation old[2];
+ #endif
  /* a permanent generation which won't be collected */
  struct gc_generation permanent_generation;
- struct gc_stats *generation_stats;
+ struct gc_generation_stats generation_stats[NUM_GENERATIONS];
  /* true if we are currently running the collector */
  int collecting;
  // The frame that started the current collection. It might be NULL even when
@@ -259,13 +236,6 @@ struct _gc_runtime_state {
  /* a list of callbacks to be invoked when collection is performed */
  PyObject *callbacks;

- Py_ssize_t heap_size;
- Py_ssize_t work_to_do;
- /* Which of the old spaces is the visited space */
- int visited_space;
- int phase;
-
- #ifdef Py_GIL_DISABLED
  /* This is the number of objects that survived the last full
  collection. It approximates the number of long lived objects
  tracked by the GC.
@@ -278,6 +248,7 @@ struct _gc_runtime_state {
  the first time. */
  Py_ssize_t long_lived_pending;

+ #ifdef Py_GIL_DISABLED
  /* True if gc.freeze() has been used. */
  int freeze_active;

@@ -293,6 +264,22 @@ struct _gc_runtime_state {
  #endif
  };

+ #ifndef Py_GIL_DISABLED
+ #define GC_GENERATION_INIT \
+ .generations = { \
+ { .threshold = 2000, }, \
+ { .threshold = 10, }, \
+ { .threshold = 10, }, \
+ },
+ #else
+ #define GC_GENERATION_INIT \
+ .young = { .threshold = 2000, }, \
+ .old = { \
+ { .threshold = 10, }, \
+ { .threshold = 10, }, \
+ },
+ #endif
+
  #include "pycore_gil.h" // struct _gil_runtime_state

  /**** Import ********/

Include/internal/pycore_runtime_init.h

Lines changed: 1 addition & 7 deletions
@@ -130,13 +130,7 @@ extern PyTypeObject _PyExc_MemoryError;
  }, \
  .gc = { \
  .enabled = 1, \
- .young = { .threshold = 2000, }, \
- .old = { \
- { .threshold = 10, }, \
- { .threshold = 0, }, \
- }, \
- .work_to_do = -5000, \
- .phase = GC_PHASE_MARK, \
+ GC_GENERATION_INIT \
  }, \
  .qsbr = { \
  .wr_seq = QSBR_INITIAL, \
