Discussion:
[Cocci] [Fwd: Re: [PATCH] rtc: sun6i: Use struct_size() in kzalloc()]
Kees Cook
2018-08-23 21:56:19 UTC
Permalink
Forwarding a question about coccinelle and isomorphisms from Kees Cook
---------- Forwarded message ----------
Date: Thu, 23 Aug 2018 13:56:35 -0700
Subject: Re: [PATCH] rtc: sun6i: Use struct_size() in kzalloc()
On Thu, Aug 23, 2018 at 11:51 AM, Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
---
drivers/rtc/rtc-sun6i.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 2cd5a7b..fe07310 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -199,8 +199,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
if (!rtc)
return;
- clk_data = kzalloc(sizeof(*clk_data) + (sizeof(*clk_data->hws) * 2),
- GFP_KERNEL);
+ clk_data = kzalloc(struct_size(clk_data, hws, 2), GFP_KERNEL);
if (!clk_data) {
kfree(rtc);
return;
This looks like entirely correct to me, but I'm surprised the
Coccinelle script didn't discover this. I guess the isomorphisms don't
cover the parenthesis?
I had this:

@@
identifier alloc =~
"kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc";
identifier VAR, ELEMENT;
expression COUNT;
@@

- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)

But I needed to explicitly change the rule to:

(
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
|
- alloc(sizeof(*VAR) + (COUNT * sizeof(*VAR->ELEMENT))
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
)

to add the ()s. I would expect arithmetic commutative expressions to
match? But I had to add parens?

-Kees
--
Kees Cook
Pixel Security
Julia Lawall
2018-08-23 22:00:04 UTC
Permalink
Post by Kees Cook
Forwarding a question about coccinelle and isomorphisms from Kees Cook
---------- Forwarded message ----------
Date: Thu, 23 Aug 2018 13:56:35 -0700
Subject: Re: [PATCH] rtc: sun6i: Use struct_size() in kzalloc()
On Thu, Aug 23, 2018 at 11:51 AM, Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
---
drivers/rtc/rtc-sun6i.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 2cd5a7b..fe07310 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -199,8 +199,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
if (!rtc)
return;
- clk_data = kzalloc(sizeof(*clk_data) + (sizeof(*clk_data->hws) * 2),
- GFP_KERNEL);
+ clk_data = kzalloc(struct_size(clk_data, hws, 2), GFP_KERNEL);
if (!clk_data) {
kfree(rtc);
return;
This looks like entirely correct to me, but I'm surprised the
Coccinelle script didn't discover this. I guess the isomorphisms don't
cover the parenthesis?
@@
identifier alloc =~
"kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc";
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
(
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
|
- alloc(sizeof(*VAR) + (COUNT * sizeof(*VAR->ELEMENT))
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
)
to add the ()s. I would expect arithmetic commutative expressions to
match? But I had to add parens?
Isomorphisms don't add parens. They only remove them. If they added
them, you would end up with the possibility of having them everywhere, in
all permutations, which would be slow and useless.

julia
Kees Cook
2018-08-23 22:06:41 UTC
Permalink
Post by Julia Lawall
Post by Kees Cook
Forwarding a question about coccinelle and isomorphisms from Kees Cook
---------- Forwarded message ----------
Date: Thu, 23 Aug 2018 13:56:35 -0700
Subject: Re: [PATCH] rtc: sun6i: Use struct_size() in kzalloc()
On Thu, Aug 23, 2018 at 11:51 AM, Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
---
drivers/rtc/rtc-sun6i.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 2cd5a7b..fe07310 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -199,8 +199,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
if (!rtc)
return;
- clk_data = kzalloc(sizeof(*clk_data) + (sizeof(*clk_data->hws) * 2),
- GFP_KERNEL);
+ clk_data = kzalloc(struct_size(clk_data, hws, 2), GFP_KERNEL);
if (!clk_data) {
kfree(rtc);
return;
This looks like entirely correct to me, but I'm surprised the
Coccinelle script didn't discover this. I guess the isomorphisms don't
cover the parenthesis?
@@
identifier alloc =~
"kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc";
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
(
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
|
- alloc(sizeof(*VAR) + (COUNT * sizeof(*VAR->ELEMENT))
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
)
to add the ()s. I would expect arithmetic commutative expressions to
match? But I had to add parens?
Isomorphisms don't add parens. They only remove them. If they added
them, you would end up with the possibility of having them everywhere, in
all permutations, which would be slow and useless.
Would a rule for:

a + (b * c)

match:

a + b * c

?

-Kees
--
Kees Cook
Pixel Security
Julia Lawall
2018-08-23 22:13:32 UTC
Permalink
Post by Kees Cook
Post by Julia Lawall
Post by Kees Cook
Forwarding a question about coccinelle and isomorphisms from Kees Cook
---------- Forwarded message ----------
Date: Thu, 23 Aug 2018 13:56:35 -0700
Subject: Re: [PATCH] rtc: sun6i: Use struct_size() in kzalloc()
On Thu, Aug 23, 2018 at 11:51 AM, Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
---
drivers/rtc/rtc-sun6i.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 2cd5a7b..fe07310 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -199,8 +199,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
if (!rtc)
return;
- clk_data = kzalloc(sizeof(*clk_data) + (sizeof(*clk_data->hws) * 2),
- GFP_KERNEL);
+ clk_data = kzalloc(struct_size(clk_data, hws, 2), GFP_KERNEL);
if (!clk_data) {
kfree(rtc);
return;
This looks like entirely correct to me, but I'm surprised the
Coccinelle script didn't discover this. I guess the isomorphisms don't
cover the parenthesis?
@@
identifier alloc =~
"kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc";
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
(
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT)
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
|
- alloc(sizeof(*VAR) + (COUNT * sizeof(*VAR->ELEMENT))
+ alloc(struct_size(VAR, ELEMENT, COUNT)
, ...)
)
to add the ()s. I would expect arithmetic commutative expressions to
match? But I had to add parens?
Isomorphisms don't add parens. They only remove them. If they added
them, you would end up with the possibility of having them everywhere, in
all permutations, which would be slow and useless.
a + (b * c)
a + b * c
I would say yes. Basically it removes the parentheses but doesn't reparse
the code, so it doesn't redo the associativity. Since * has higher
precedence than +, then both will be matched. On the other hand, if you
put:

(a + b) * c

It will consider a pattern with the parentheses removed, but that pattern
won't match anything, because real trees that consist of a + b * c are
parsed in a different way.

julia
Kees Cook
2018-08-23 22:27:52 UTC
Permalink
Post by Julia Lawall
(a + b) * c
It will consider a pattern with the parentheses removed, but that pattern
won't match anything, because real trees that consist of a + b * c are
parsed in a different way.
$ spatch --version
spatch version 1.0.4 with Python support and with PCRE support
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = a * b + c;
But "(a * b) + c" for the rule does:


$ cat isomorphisms.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
int a, b, c;

a = atoi(argv[1]);
b = atoi(argv[2]);
c = atoi(argv[3]);

printf("%d\n", a + b * c);
printf("%d\n", (a + b) * c);
printf("%d\n", a + (b * c));
printf("%d\n", b * c + a);
printf("%d\n", b * (c + a));
printf("%d\n", (b * c) + a);

return 0;
}
$ cat match.cocci
@@
identifier A, B, C;
@@

* (A * B) + C

$ spatch -sp-file match.cocci isomorphisms.c
init_defs_builtins: /usr/lib/coccinelle/standard.h
HANDLING: isomorphisms.c
diff =
--- isomorphisms.c
+++ /tmp/cocci-output-8402-870fd6-isomorphisms.c
@@ -9,12 +9,8 @@ int main(int argc, char *argv[])
b = atoi(argv[2]);
c = atoi(argv[3]);

- printf("%d\n", a + b * c);
printf("%d\n", (a + b) * c);
- printf("%d\n", a + (b * c));
- printf("%d\n", b * c + a);
printf("%d\n", b * (c + a));
- printf("%d\n", (b * c) + a);

return 0;
}


So it sounds like I should put parens around all kinds of things in my rules...

-Kees
--
Kees Cook
Pixel Security
Julia Lawall
2018-08-23 22:45:18 UTC
Permalink
Post by Kees Cook
Post by Julia Lawall
(a + b) * c
It will consider a pattern with the parentheses removed, but that pattern
won't match anything, because real trees that consist of a + b * c are
parsed in a different way.
$ spatch --version
spatch version 1.0.4 with Python support and with PCRE support
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = a * b + c;
$ cat isomorphisms.c
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int a, b, c;
a = atoi(argv[1]);
b = atoi(argv[2]);
c = atoi(argv[3]);
printf("%d\n", a + b * c);
printf("%d\n", (a + b) * c);
printf("%d\n", a + (b * c));
printf("%d\n", b * c + a);
printf("%d\n", b * (c + a));
printf("%d\n", (b * c) + a);
return 0;
}
$ cat match.cocci
@@
identifier A, B, C;
@@
* (A * B) + C
$ spatch -sp-file match.cocci isomorphisms.c
init_defs_builtins: /usr/lib/coccinelle/standard.h
HANDLING: isomorphisms.c
diff =
--- isomorphisms.c
+++ /tmp/cocci-output-8402-870fd6-isomorphisms.c
@@ -9,12 +9,8 @@ int main(int argc, char *argv[])
b = atoi(argv[2]);
c = atoi(argv[3]);
- printf("%d\n", a + b * c);
printf("%d\n", (a + b) * c);
- printf("%d\n", a + (b * c));
- printf("%d\n", b * c + a);
printf("%d\n", b * (c + a));
- printf("%d\n", (b * c) + a);
return 0;
}
So it sounds like I should put parens around all kinds of things in my rules...
If you think someone might reasonably use parentheses somewhere, you
should put them.

julia
Julia Lawall
2018-08-24 00:17:16 UTC
Permalink
Post by Julia Lawall
(a + b) * c
It will consider a pattern with the parentheses removed, but that pattern
won't match anything, because real trees that consist of a + b * c are
parsed in a different way.
$ spatch --version
spatch version 1.0.4 with Python support and with PCRE support
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = a * b + c;
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = (a * b) + c;
$ cat test_mul.c
int a, b, c, d;
void foo(void)
{
d = ((a * b) + c);
d = ((a * b)) + c;
d = (a * b) + c;
d = a * b + c;
}
$ spatch -sp-file match_mul.cocci test_mul.c
init_defs_builtins: /usr/lib/coccinelle/standard.h
HANDLING: test_mul.c
diff =
--- test_mul.c
+++ /tmp/cocci-output-23856-7e83f7-test_mul.c
@@ -4,6 +4,4 @@ void foo(void)
{
d = ((a * b) + c);
d = ((a * b)) + c;
- d = (a * b) + c;
- d = a * b + c;
}
I'm not sure to get the point.

If you think that people may fully parenthesize the code, then ((a * b) +
c) would be a reasonable option for the pattern. I'm not sure why double
parentheses, ((a * b)) + c, should exist in the first place.

julia
Joe Perches
2018-08-24 00:25:51 UTC
Permalink
Post by Julia Lawall
Post by Julia Lawall
(a + b) * c
It will consider a pattern with the parentheses removed, but that pattern
won't match anything, because real trees that consist of a + b * c are
parsed in a different way.
$ spatch --version
spatch version 1.0.4 with Python support and with PCRE support
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = a * b + c;
$ cat match_mul.cocci
@@
expression a, b, c;
int d;
@@
* d = (a * b) + c;
$ cat test_mul.c
int a, b, c, d;
void foo(void)
{
d = ((a * b) + c);
d = ((a * b)) + c;
d = (a * b) + c;
d = a * b + c;
}
$ spatch -sp-file match_mul.cocci test_mul.c
init_defs_builtins: /usr/lib/coccinelle/standard.h
HANDLING: test_mul.c
diff =
--- test_mul.c
+++ /tmp/cocci-output-23856-7e83f7-test_mul.c
@@ -4,6 +4,4 @@ void foo(void)
{
d = ((a * b) + c);
d = ((a * b)) + c;
- d = (a * b) + c;
- d = a * b + c;
}
I'm not sure to get the point.
If you think that people may fully parenthesize the code, then ((a * b) +
c) would be a reasonable option for the pattern. I'm not sure why double
parentheses, ((a * b)) + c, should exist in the first place.
People who write code do a lot of unnecessary things.
It's code I've seen in the past.
It occurs more with expressions than identifiers.

It'd be nice if coccinelle could parse arbitrarily
parenthesized code and script down to its minimal
logical requirements and match as maximally as possible.

cheers, Joe

Loading...