Commit add9d3a

Forward-port generational GC.
Co-Authored-By: Neil Schemenauer <nas@arctrix.com>
Co-Authored-By: Zanie Blue <contact@zanie.dev>
Co-Authored-By: Sergey Miryanov <sergey.miryanov@gmail.com>
1 parent 7ce737e commit add9d3a

11 files changed: +523 -1082 lines changed

Doc/library/gc.rst

Lines changed: 15 additions & 38 deletions
@@ -37,18 +37,11 @@ The :mod:`!gc` module provides the following functions:

 .. function:: collect(generation=2)

-   Perform a collection. The optional argument *generation*
+   With no arguments, run a full collection. The optional argument *generation*
    may be an integer specifying which generation to collect (from 0 to 2). A
    :exc:`ValueError` is raised if the generation number is invalid. The sum of
    collected objects and uncollectable objects is returned.

-   Calling ``gc.collect(0)`` will perform a GC collection on the young generation.
-
-   Calling ``gc.collect(1)`` will perform a GC collection on the young generation
-   and an increment of the old generation.
-
-   Calling ``gc.collect(2)`` or ``gc.collect()`` performs a full collection
-
    The free lists maintained for a number of built-in types are cleared
    whenever a full collection or collection of the highest generation (2)
    is run. Not all items in some free lists may be freed due to the
@@ -57,9 +50,6 @@ The :mod:`!gc` module provides the following functions:
    The effect of calling ``gc.collect()`` while the interpreter is already
    performing a collection is undefined.

-   .. versionchanged:: 3.14
-      ``generation=1`` performs an increment of collection.
-

 .. function:: set_debug(flags)
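The `collect()` wording restored by the hunks above can be tried with a short sketch that uses only documented `gc` calls. `Node`, `make_cycle`, and `collect_invalid_generation` are hypothetical names introduced for the illustration. The exact number returned by `collect()` depends on the interpreter version, so the sketch does not assume a particular count.

```python
import gc


class Node:
    """Hypothetical container type used only for this illustration."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other, then drop them."""
    a, b = Node(), Node()
    a.other, b.other = b, a


def collect_invalid_generation():
    """Return the exception raised for an out-of-range generation."""
    try:
        gc.collect(3)  # valid generations are 0, 1 and 2
    except ValueError as exc:
        return exc
    return None


make_cycle()
found = gc.collect()    # no argument: full collection
young = gc.collect(0)   # only the youngest generation
error = collect_invalid_generation()
```

Both return values are the sum of collected and uncollectable objects, as the documentation text states.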

@@ -75,20 +65,13 @@ The :mod:`!gc` module provides the following functions:

 .. function:: get_objects(generation=None)

-
    Returns a list of all objects tracked by the collector, excluding the list
-   returned. If *generation* is not ``None``, return only the objects as follows:
-
-   * 0: All objects in the young generation
-   * 1: No objects, as there is no generation 1 (as of Python 3.14)
-   * 2: All objects in the old generation
+   returned. If *generation* is not ``None``, return only the objects tracked by
+   the collector that are in that generation.

    .. versionchanged:: 3.8
       New *generation* parameter.

-   .. versionchanged:: 3.14
-      Generation 1 is removed
-
    .. audit-event:: gc.get_objects generation gc.get_objects

 .. function:: get_stats()
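A minimal sketch of the per-generation form of `get_objects()` described in the hunk above. `Tracked` and `generation_of` are hypothetical names. On a build with the three-generation collector, one would expect a fresh object to be reported in generation 0 and, after a full collection, in generation 2. That expectation is an assumption and the sketch does not depend on it.

```python
import gc


class Tracked:
    """Hypothetical GC-tracked type used only for this illustration."""


def generation_of(obj):
    """Return the first generation whose object list contains obj, or None."""
    for generation in range(3):
        if any(o is obj for o in gc.get_objects(generation=generation)):
            return generation
    return None


obj = Tracked()
before = generation_of(obj)   # generation reported right after creation
gc.collect()                  # full collection; obj survives because it is referenced
after = generation_of(obj)    # generation reported after surviving a collection
tracked = any(o is obj for o in gc.get_objects())
```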
@@ -124,32 +107,26 @@ The :mod:`!gc` module provides the following functions:
    Set the garbage collection thresholds (the collection frequency). Setting
    *threshold0* to zero disables collection.

-   The GC classifies objects into two generations depending on whether they have
-   survived a collection. New objects are placed in the young generation. If an
-   object survives a collection it is moved into the old generation.
-
-   In order to decide when to run, the collector keeps track of the number of object
+   The GC classifies objects into three generations depending on how many
+   collection sweeps they have survived. New objects are placed in the youngest
+   generation (generation ``0``). If an object survives a collection it is moved
+   into the next older generation. Since generation ``2`` is the oldest
+   generation, objects in that generation remain there after a collection. In
+   order to decide when to run, the collector keeps track of the number of object
    allocations and deallocations since the last collection. When the number of
    allocations minus the number of deallocations exceeds *threshold0*, collection
-   starts. For each collection, all the objects in the young generation and some
-   fraction of the old generation is collected.
+   starts. Initially only generation ``0`` is examined. If generation ``0`` has
+   been examined more than *threshold1* times since generation ``1`` has been
+   examined, then generation ``1`` is examined as well.
+   With the third generation, things are a bit more complicated,
+   see `Collecting the oldest generation <https://github.com/python/cpython/blob/ff0ef0a54bef26fc507fbf9b7a6009eb7d3f17f5/InternalDocs/garbage_collector.md#collecting-the-oldest-generation>`_ for more information.

    In the free-threaded build, the increase in process memory usage is also
    checked before running the collector. If the memory usage has not increased
    by 10% since the last collection and the net number of object allocations
    has not exceeded 40 times *threshold0*, the collection is not run.

-   The fraction of the old generation that is collected is **inversely** proportional
-   to *threshold1*. The larger *threshold1* is, the slower objects in the old generation
-   are collected.
-   For the default value of 10, 1% of the old generation is scanned during each collection.
-
-   *threshold2* is ignored.
-
-   See `Garbage collector design <https://devguide.python.org/garbage_collector>`_ for more information.
-
-   .. versionchanged:: 3.14
-      *threshold2* is ignored
+   See `Garbage collector design <https://github.com/python/cpython/blob/3.15/InternalDocs/garbage_collector.md>`_ for more information.


 .. function:: get_count()
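The triggering rule in the restored `set_threshold()` text can be followed with a toy model. This is a sketch of the documented rule only, not the collector's code: `examined_generations` is a hypothetical name, deallocations are ignored, and generation 2 is left out because the text defers it to the internal documentation. The defaults 2000 and 10 are the values that appear in `GC_GENERATION_INIT` in this commit.

```python
def examined_generations(net_allocations, threshold0=2000, threshold1=10):
    """Toy model: return the oldest generation examined at each collection."""
    count0 = 0   # net allocations since the last collection
    count1 = 0   # generation-0 collections since generation 1 was examined
    history = []
    for _ in range(net_allocations):
        count0 += 1
        if count0 <= threshold0:
            continue            # threshold0 not exceeded yet, no collection
        if count1 > threshold1:
            history.append(1)   # generations 0 and 1 examined together
            count1 = 0
        else:
            history.append(0)   # only generation 0 examined
            count1 += 1
        count0 = 0
    return history


# Small thresholds keep the trace short enough to read.
history = examined_generations(16, threshold0=3, threshold1=2)
```

With these small thresholds, every fourth allocation starts a collection, and the fourth collection is the first to include generation 1.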

Include/internal/pycore_gc.h

Lines changed: 5 additions & 24 deletions
@@ -131,7 +131,6 @@ static inline void _PyObject_GC_SET_SHARED(PyObject *op) {
  * When objects are moved from the pending space, old[gcstate->visited_space^1]
  * into the increment, the old space bit is flipped.
  */
-#define _PyGC_NEXT_MASK_OLD_SPACE_1 1

 #define _PyGC_PREV_SHIFT 2
 #define _PyGC_PREV_MASK (((uintptr_t) -1) << _PyGC_PREV_SHIFT)
@@ -159,13 +158,11 @@ typedef enum {
 // Lowest bit of _gc_next is used for flags only in GC.
 // But it is always 0 for normal code.
 static inline PyGC_Head* _PyGCHead_NEXT(PyGC_Head *gc) {
-    uintptr_t next = gc->_gc_next & _PyGC_PREV_MASK;
+    uintptr_t next = gc->_gc_next;
     return (PyGC_Head*)next;
 }
 static inline void _PyGCHead_SET_NEXT(PyGC_Head *gc, PyGC_Head *next) {
-    uintptr_t unext = (uintptr_t)next;
-    assert((unext & ~_PyGC_PREV_MASK) == 0);
-    gc->_gc_next = (gc->_gc_next & ~_PyGC_PREV_MASK) | unext;
+    gc->_gc_next = (uintptr_t)next;
 }

 // Lowest two bits of _gc_prev is used for _PyGC_PREV_MASK_* flags.
@@ -207,10 +204,6 @@ static inline void _PyGC_CLEAR_FINALIZED(PyObject *op) {

 extern void _Py_ScheduleGC(PyThreadState *tstate);

-#ifndef Py_GIL_DISABLED
-extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
-#endif
-

 /* Tell the GC to track this object.
  *
@@ -220,7 +213,7 @@ extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
  * ob_traverse method.
  *
  * Internal note: interp->gc.generation0->_gc_prev doesn't have any bit flags
- * because it's not object header. So we don't use _PyGCHead_PREV() and
+ * because it's not an object header. So we don't use _PyGCHead_PREV() and
  * _PyGCHead_SET_PREV() for it to avoid unnecessary bitwise operations.
  *
  * See also the public PyObject_GC_Track() function.
@@ -244,19 +237,12 @@ static inline void _PyObject_GC_TRACK(
                           "object is in generation which is garbage collected",
                           filename, lineno, __func__);

-    struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
-    PyGC_Head *generation0 = &gcstate->young.head;
+    PyGC_Head *generation0 = _PyInterpreterState_GET()->gc.generation0;
     PyGC_Head *last = (PyGC_Head*)(generation0->_gc_prev);
     _PyGCHead_SET_NEXT(last, gc);
     _PyGCHead_SET_PREV(gc, last);
-    uintptr_t not_visited = 1 ^ gcstate->visited_space;
-    gc->_gc_next = ((uintptr_t)generation0) | not_visited;
+    _PyGCHead_SET_NEXT(gc, generation0);
     generation0->_gc_prev = (uintptr_t)gc;
-    gcstate->young.count++; /* number of tracked GC objects */
-    gcstate->heap_size++;
-    if (gcstate->young.count > gcstate->young.threshold) {
-        _Py_TriggerGC(gcstate);
-    }
 #endif
 }

@@ -291,11 +277,6 @@ static inline void _PyObject_GC_UNTRACK(
     _PyGCHead_SET_PREV(next, prev);
     gc->_gc_next = 0;
     gc->_gc_prev &= _PyGC_PREV_MASK_FINALIZED;
-    struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
-    if (gcstate->young.count > 0) {
-        gcstate->young.count--;
-    }
-    gcstate->heap_size--;
 #endif
 }
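After this change, the list manipulation in `_PyObject_GC_TRACK` reads as a plain tail insert into a circular doubly linked list whose sentinel is `generation0`. The following is a toy Python model, not CPython code. `Head`, `track`, `untrack`, and `names` are hypothetical names, and the flag bits that the real `_gc_prev` field carries are ignored. The unlink step follows the `_PyObject_GC_UNTRACK` context lines only as far as the hunk shows them.

```python
class Head:
    """Toy stand-in for PyGC_Head: a node in a circular doubly linked list."""

    def __init__(self, name):
        self.name = name
        self.next = self   # an empty list is a sentinel that points at itself
        self.prev = self


def track(generation0, node):
    """Append node at the tail, in the same order as the C statements."""
    last = generation0.prev
    last.next = node           # _PyGCHead_SET_NEXT(last, gc)
    node.prev = last           # _PyGCHead_SET_PREV(gc, last)
    node.next = generation0    # _PyGCHead_SET_NEXT(gc, generation0)
    generation0.prev = node    # generation0->_gc_prev = (uintptr_t)gc


def untrack(node):
    """Unlink node from whichever list it is in."""
    prev, nxt = node.prev, node.next
    prev.next = nxt
    nxt.prev = prev
    node.next = None           # the C code stores 0 in _gc_next
    node.prev = None           # simplification: the C code keeps one flag bit


def names(generation0):
    """Walk the list from the sentinel and collect node names in order."""
    out = []
    node = generation0.next
    while node is not generation0:
        out.append(node.name)
        node = node.next
    return out


young = Head("generation0")
first, second = Head("first"), Head("second")
track(young, first)
track(young, second)
order = names(young)
```

Because the sentinel is never an object header, its `prev` link needs no masking, which is the point made in the internal note in the hunk above.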

Include/internal/pycore_interp_structs.h

Lines changed: 23 additions & 23 deletions
@@ -181,29 +181,14 @@ struct gc_generation {
 struct gc_generation_stats {
     PyTime_t ts_start;
     PyTime_t ts_stop;
-
-    /* heap_size on the start of the collection */
-    Py_ssize_t heap_size;
-
-    /* work_to_do on the start of the collection */
-    Py_ssize_t work_to_do;
-
     /* total number of collections */
     Py_ssize_t collections;
-
-    /* total number of visited objects */
-    Py_ssize_t object_visits;
-
     /* total number of collected objects */
     Py_ssize_t collected;
     /* total number of uncollectable objects (put into gc.garbage) */
     Py_ssize_t uncollectable;
     // Total number of objects considered for collection and traversed:
     Py_ssize_t candidates;
-
-    Py_ssize_t objects_transitively_reachable;
-    Py_ssize_t objects_not_transitively_reachable;
-
     // Total duration of the collection in seconds:
     double duration;
 };
@@ -231,7 +216,7 @@ enum _GCPhase {
 };

 /* If we change this, we need to change the default value in the
-   signature of gc.collect and change the size of PyStats.gc_stats */
+   signature of gc.collect */
 #define NUM_GENERATIONS 3

 struct gc_stats {
@@ -244,8 +229,13 @@ struct _gc_runtime_state {
     int enabled;
     int debug;
     /* linked lists of container objects */
+#ifndef Py_GIL_DISABLED
+    struct gc_generation generations[NUM_GENERATIONS];
+    PyGC_Head *generation0;
+#else
     struct gc_generation young;
     struct gc_generation old[2];
+#endif
     /* a permanent generation which won't be collected */
     struct gc_generation permanent_generation;
     struct gc_stats *generation_stats;
@@ -259,13 +249,6 @@ struct _gc_runtime_state {
     /* a list of callbacks to be invoked when collection is performed */
     PyObject *callbacks;

-    Py_ssize_t heap_size;
-    Py_ssize_t work_to_do;
-    /* Which of the old spaces is the visited space */
-    int visited_space;
-    int phase;
-
-#ifdef Py_GIL_DISABLED
     /* This is the number of objects that survived the last full
        collection. It approximates the number of long lived objects
        tracked by the GC.
@@ -278,6 +261,7 @@ struct _gc_runtime_state {
        the first time. */
     Py_ssize_t long_lived_pending;

+#ifdef Py_GIL_DISABLED
     /* True if gc.freeze() has been used. */
     int freeze_active;

@@ -293,6 +277,22 @@ struct _gc_runtime_state {
 #endif
 };

+#ifndef Py_GIL_DISABLED
+#define GC_GENERATION_INIT \
+    .generations = { \
+        { .threshold = 2000, }, \
+        { .threshold = 10, }, \
+        { .threshold = 10, }, \
+    },
+#else
+#define GC_GENERATION_INIT \
+    .young = { .threshold = 2000, }, \
+    .old = { \
+        { .threshold = 10, }, \
+        { .threshold = 10, }, \
+    },
+#endif
+
 #include "pycore_gil.h" // struct _gil_runtime_state

 /**** Import ********/

Include/internal/pycore_runtime_init.h

Lines changed: 1 addition & 7 deletions
@@ -130,13 +130,7 @@ extern PyTypeObject _PyExc_MemoryError;
         }, \
         .gc = { \
             .enabled = 1, \
-            .young = { .threshold = 2000, }, \
-            .old = { \
-                { .threshold = 10, }, \
-                { .threshold = 0, }, \
-            }, \
-            .work_to_do = -5000, \
-            .phase = GC_PHASE_MARK, \
+            GC_GENERATION_INIT \
         }, \
         .qsbr = { \
             .wr_seq = QSBR_INITIAL, \
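The thresholds initialised by `GC_GENERATION_INIT` are the values that `gc.get_threshold()` reports at startup. A small sketch using only documented `gc` calls reads them, changes the first two, and restores the originals. The values 1000 and 5 are arbitrary, and the defaults observed depend on the interpreter version and build.

```python
import gc

original = gc.get_threshold()   # (threshold0, threshold1, threshold2)

gc.set_threshold(1000, 5)       # arbitrary example values; threshold2 untouched
changed = gc.get_threshold()

gc.set_threshold(*original)     # restore the interpreter's settings
restored = gc.get_threshold()
```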
