* Data flow barriers and barrier guards can now be added using data extensions. For more information, see [Customizing library models for C and C++](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-cpp/).
* Data flow barriers and barrier guards can now be added using data extensions. For more information, see [Customizing library models for C#](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-csharp/).
docs/codeql/codeql-language-guides/customizing-library-models-for-cpp.rst
Lines changed: 91 additions & 13 deletions
@@ -58,6 +58,8 @@ The CodeQL library for CPP analysis exposes the following extensible predicates:
 - ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data. The ``kind`` of the sources defined using this predicate determines which threat model they are associated with. Different threat models can be used to customize the sources used in an analysis. For more information, see ":ref:`Threat models <threat-models-cpp>`."
 - ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data may be used in a way that makes the code vulnerable.
 - ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to model flow through elements.
+- ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint.
+- ``barrierGuardModel(namespace, type, subtypes, name, signature, ext, input, acceptingValue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check.
 
 The extensible predicates are populated using the models defined in data extension files.
@@ -75,7 +77,7 @@ This example shows how the CPP query pack models the return value from the ``rea
-We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name, signature, ext, output, kind, provenance) extensible predicate by updating a data extension file.
+We need to add a tuple to the ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)`` extensible predicate by updating a data extension file.
 
 .. code-block:: yaml
@@ -86,12 +88,11 @@ We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name,
 Since we are adding a new source, we need to add a tuple to the ``sourceModel`` extensible predicate.
 The first five values identify the callable (in this case a free function) to be modeled as a source.
 
 - The first value ``"boost::asio"`` is the namespace name.
-- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
-- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
+- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
+- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
 - The fourth value ``"read_until"`` is the function name.
 - The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``read_until``.
@@ -111,7 +112,7 @@ This example shows how the CPP query pack models the second argument of the ``bo
 
     boost::asio::write(socket, send_buffer, error);
 
-We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, signature, ext, input, kind, provenance) extensible predicate by updating a data extension file.
+We need to add a tuple to the ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)`` extensible predicate by updating a data extension file.
 
 .. code-block:: yaml
@@ -122,12 +123,11 @@ We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, si
 Since we want to add a new sink, we need to add a tuple to the ``sinkModel`` extensible predicate.
 The first five values identify the callable (in this case a free function) to be modeled as a sink.
 
 - The first value ``"boost::asio"`` is the namespace name.
-- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
-- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
+- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
+- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
 - The fourth value ``"write"`` is the function name.
 - The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``write``.
@@ -147,7 +147,7 @@ This example shows how the CPP query pack models flow through a function for a s
-We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance) extensible predicate by updating a data extension file:
+We need to add tuples to the ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)`` extensible predicate by updating a data extension file:
 
 .. code-block:: yaml
@@ -158,13 +158,11 @@ We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name,
 Since we are adding flow through a function, we need to add tuples to the ``summaryModel`` extensible predicate.
-
 The first five values identify the callable (in this case a free function) to be modeled as a summary.
 
 - The first value ``"boost::asio"`` is the namespace name.
-- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
-- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
+- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
+- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
 - The fourth value ``"buffer"`` is the function name.
 - The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``buffer``.
@@ -176,6 +174,86 @@ The remaining values are used to define the input and output specifications, the
 - The ninth value ``"taint"`` is the kind of the flow. ``taint`` means that taint is propagated through the call.
 - The tenth value ``"manual"`` is the provenance of the summary, which is used to identify the origin of the summary model.
 
+
+Example: Taint barrier using the ``mysql_real_escape_string`` function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This example shows how the CPP query pack models the ``mysql_real_escape_string`` function as a barrier for SQL injection.
+This function escapes special characters in a string for use in an SQL statement, which prevents SQL injection attacks.
+
+.. code-block:: cpp
+
+    const char *query = "SELECT * FROM users WHERE name = '%s'";
+    char *name = get_untrusted_input();
+    char *escaped_name = new char[2 * strlen(name) + 1];
+    mysql_real_escape_string(mysql, escaped_name, name, strlen(name)); // escaped_name is now safe to use in an SQL statement.
+    sprintf(query_buffer, query, escaped_name);
+
+We need to add a tuple to the ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)`` extensible predicate by updating a data extension file.
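As a minimal sketch, the tuple for this example might be written in a data extension file as shown below. The values follow the field descriptions given for this example. The sketch assumes that the model is added to the ``codeql/cpp-all`` pack and that the fifth (signature) and sixth (``ext``) values are left empty, so treat it as an illustration and not as the query pack's own model file.

.. code-block:: yaml

   # Illustrative sketch only; the pack name and the empty signature and ext values are assumptions.
   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: barrierModel
       data:
         # namespace, type, subtypes, name, signature, ext, output, kind, provenance
         - ["", "", False, "mysql_real_escape_string", "", "", "Argument[*1]", "sql-injection", "manual"]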
+The first five values identify the callable (in this case a free function) to be modeled as a barrier.
+
+- The first value ``""`` is the namespace name. Because ``mysql_real_escape_string`` is declared in the global namespace, the namespace is left blank.
+- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
+- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
+- The fourth value ``"mysql_real_escape_string"`` is the function name.
+- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name.
+
+The sixth value should be left empty and is out of scope for this documentation.
+The remaining values are used to define the output specification, the ``kind``, and the ``provenance`` (origin) of the barrier.
+
+- The seventh value ``"Argument[*1]"`` is the output specification, which means in this case that the barrier is the first indirection (or pointed-to value, ``*``) of the second argument (``Argument[1]``) passed to the function.
+- The eighth value ``"sql-injection"`` is the kind of the barrier. The barrier kind is used to define the queries where the barrier is in scope.
+- The ninth value ``"manual"`` is the provenance of the barrier, which is used to identify the origin of the barrier model.
+
+Example: Add a barrier guard
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This example shows how to model a barrier guard that stops the flow of taint when a conditional check is performed on data.
+A barrier guard model is used when a function returns a boolean that indicates whether the data is safe to use.
+Consider a function called ``is_safe`` which returns ``true`` when the data is considered safe.
+
+.. code-block:: cpp
+
+    if (is_safe(user_input)) { // The check guards the use, so the input is safe.
+        mysql_query(mysql, user_input); // This is safe.
+    }
+
+We need to add a tuple to the ``barrierGuardModel(namespace, type, subtypes, name, signature, ext, input, acceptingValue, kind, provenance)`` extensible predicate by updating a data extension file.
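As a minimal sketch, the tuple for the hypothetical ``is_safe`` function might be written in a data extension file as shown below. The values follow the field descriptions given for this example. The sketch assumes that the model is added to the ``codeql/cpp-all`` pack, that the fifth (signature) and sixth (``ext``) values are left empty, and that the accepting value is written as the string ``"true"``.

.. code-block:: yaml

   # Illustrative sketch only; the pack name, the empty signature and ext values,
   # and the quoted accepting value are assumptions.
   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: barrierGuardModel
       data:
         # namespace, type, subtypes, name, signature, ext, input, acceptingValue, kind, provenance
         - ["", "", False, "is_safe", "", "", "Argument[*0]", "true", "sql-injection", "manual"]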
+The first five values identify the callable (in this case a free function) to be modeled as a barrier guard.
+
+- The first value ``""`` is the namespace name.
+- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
+- The third value ``False`` is a flag that indicates whether or not the barrier guard also applies to all overrides of the method. For a free function, this should be ``False``.
+- The fourth value ``"is_safe"`` is the function name.
+- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name.
+
+The sixth value should be left empty and is out of scope for this documentation.
+The remaining values are used to define the input specification, the ``acceptingValue``, the ``kind``, and the ``provenance`` (origin) of the barrier guard.
+
+- The seventh value ``"Argument[*0]"`` is the input specification (the value being validated). In this case, it is the first indirection (or pointed-to value, ``*``) of the first argument (``Argument[0]``) passed to the function.
+- The eighth value ``true`` is the accepting value of the barrier guard. This is the value that the conditional check must return for the barrier to apply.
+- The ninth value ``"sql-injection"`` is the kind of the barrier guard. The barrier guard kind is used to define the queries where the barrier guard is in scope.
+- The tenth value ``"manual"`` is the provenance of the barrier guard, which is used to identify the origin of the barrier guard.