Discussion:
[Cocci] =~ runtime improvements?
Kees Cook
2018-09-27 18:51:15 UTC
Hi,

This .cocci takes a VERY long time to run against the kernel, and I'd
love to know what I could do to improve it. I assume it's related to
the use of the "=~" operator:

// Replace multi-factor out-of-line products with array_size() usage.
@@
identifier alloc =~ ".*alloc.*";
constant C1, C2, C3;
identifier ISTRIDE, ISIZE, ICOUNT;
expression ESTRIDE, ESIZE, ECOUNT;
expression PRODUCT, OTHER;
@@

(
PRODUCT = ((C1)) * ((C2)) * ((C3))
|
PRODUCT = ((C1)) * ((C2))
|
- PRODUCT = ((ICOUNT)) * ((ISTRIDE)) * ((ISIZE))
+ PRODUCT = array3_size(ICOUNT, ISTRIDE, ISIZE)
|
- PRODUCT = ((ICOUNT)) * ((ISTRIDE)) * ((ESIZE))
+ PRODUCT = array3_size(ICOUNT, ISTRIDE, ESIZE)
|
- PRODUCT = ((ICOUNT)) * ((ESTRIDE)) * ((ESIZE))
+ PRODUCT = array3_size(ICOUNT, ESTRIDE, ESIZE)
|
- PRODUCT = ((ECOUNT)) * ((ESTRIDE)) * ((ESIZE))
+ PRODUCT = array3_size(ECOUNT, ESTRIDE, ESIZE)
|
- PRODUCT = ((ICOUNT)) * ((ISIZE))
+ PRODUCT = array_size(ICOUNT, ISIZE)
|
- PRODUCT = ((ICOUNT)) * ((ESIZE))
+ PRODUCT = array_size(ICOUNT, ESIZE)
|
- PRODUCT = ((ECOUNT)) * ((ESIZE))
+ PRODUCT = array_size(ECOUNT, ESIZE)
)
... when != PRODUCT = OTHER
alloc(..., PRODUCT, ...)

Thanks!

-Kees
--
Kees Cook
Pixel Security
Julia Lawall
2018-09-27 21:09:26 UTC
The rule contains "..." and very little that is concrete. Regular
expression constraints aren't used to select files, so every file has to
be considered. Big disjunctions are also costly. You could consider
making a series of 9 rules instead. Do you need the double parentheses?
They add more disjunctions.

julia
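
As a rough sketch of the "series of rules" idea, the all-identifier
three-factor branch could stand alone as shown here, with the double
parentheses dropped. This is untested and not taken from the thread. Note
that once the branches are separate rules, the constant-only branches no
longer shield the later, more general rules the way they do inside a
single disjunction, so the expression-based rules would need their own way
of skipping constant products.

```
// Hypothetical stand-alone rule for one branch of the original disjunction.
@@
identifier alloc =~ ".*alloc.*";
identifier ICOUNT, ISTRIDE, ISIZE;
expression PRODUCT, OTHER;
@@

- PRODUCT = ICOUNT * ISTRIDE * ISIZE
+ PRODUCT = array3_size(ICOUNT, ISTRIDE, ISIZE)
  ... when != PRODUCT = OTHER
  alloc(..., PRODUCT, ...)
```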
_______________________________________________
Cocci mailing list
https://systeme.lip6.fr/mailman/listinfo/cocci
Lars-Peter Clausen
2018-09-30 15:06:22 UTC
Maybe I'm missing something, but do you need all of those variations? An
expression metavariable should also match an identifier, so I'd expect
((ECOUNT)) * ((ESTRIDE)) * ((ESIZE)) to match a superset of all the other
patterns with ICOUNT, ISIZE and ISTRIDE in them. So you only need two
rules: one for array_size and one for array3_size.
Julia Lawall
2018-09-30 15:40:09 UTC
I agree about the identifiers and expressions, although he also needs
some rules for the constant case.

thanks,
julia
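
One way to combine the two points above, sketched for the three-factor
case: use only expression metavariables for the rewrite, and keep a single
constant-only branch in front of it so that all-constant products are
matched first and left alone. E1, E2 and E3 are hypothetical names, the
rule is untested, and it assumes that an earlier branch of a disjunction
takes priority over later ones, as the original rule already relies on.

```
// Hypothetical reduced rule: one constant branch to skip, one to rewrite.
@@
identifier alloc =~ ".*alloc.*";
constant C1, C2, C3;
expression E1, E2, E3;
expression PRODUCT, OTHER;
@@

(
  PRODUCT = C1 * C2 * C3
|
- PRODUCT = E1 * E2 * E3
+ PRODUCT = array3_size(E1, E2, E3)
)
  ... when != PRODUCT = OTHER
  alloc(..., PRODUCT, ...)
```

A two-factor rule using array_size() would follow the same shape.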
Kees Cook
2018-09-30 16:54:30 UTC
I had to go progressively to exclude cases in an attempt to isolate
individual factors. For example:

E1 * E2

will match:

var1 * var2 * var3

To make a best-effort attempt at extracting the multiplication
factors, I need to go in order from constants (ignore) to identifiers
(explicitly correct) to expressions (may over-match).

But yes, it seems the problem is mainly the "..." part, which is unavoidable.

-Kees
Julia Lawall
2018-09-30 17:11:27 UTC
Post by Kees Cook
E1 * E2
will match:
var1 * var2 * var3
I think you could just do E1 * E2 * E3 before E1 * E2?
Post by Kees Cook
But yes, it seems the problem is mainly the "..." part, which is unavoidable.
Maybe it would be good to have some special cases? Are the
multiplications often right next to the allocation? Or if there is
something that is often between them, then it could be useful to make a
special case for that.

julia
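
A minimal sketch of both suggestions together, assuming the common case
is a product computed in the statement immediately before an allocation
whose result is assigned directly. The three-factor branch comes before
the two-factor branch, and there is no "..." between the product and the
call. PTR, E1, E2 and E3 are hypothetical names, the rule is untested, and
it makes no attempt to skip all-constant products.

```
// Hypothetical special case: product directly followed by the allocation.
@@
identifier alloc =~ ".*alloc.*";
expression E1, E2, E3;
expression PRODUCT, PTR;
@@

(
- PRODUCT = E1 * E2 * E3;
+ PRODUCT = array3_size(E1, E2, E3);
|
- PRODUCT = E1 * E2;
+ PRODUCT = array_size(E1, E2);
)
  PTR = alloc(..., PRODUCT, ...);
```

Cases this rule misses would still need the slower "..." version, so any
benefit depends on how often the two statements really are adjacent.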
