Discussion:
[Cocci] Finding embedded function names?
Joe Perches
2014-12-05 01:12:31 UTC
Is it possible for coccinelle to look at the name
of a function that might be embedded in a format?

i.e., for:

void testme(void)
{
	printf("testme: some message\n");
}

Can it find the "testme" embedded in a format string?
Julia Lawall
2014-12-05 07:18:18 UTC
Post by Joe Perches
Is it possible for coccinelle to look at the name
of a function that might be embedded in a format?
void testme(void)
{
printf("testme: some message\n");
}
Can it find the "testme" embedded in a format string?
Yes, by using python/ocaml:

@r@
char [] c;
position p;
identifier f;
@@

f(...,c@p,...)

@script:ocaml@
c << r.c;
p << r.p;
@@

let ce = (List.hd p).current_element in
if List.length (Str.split_delim (Str.regexp ce) c) > 1
then
  Printf.printf "%s:%d: %s\n"
    (List.hd p).file (List.hd p).line c
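
For readers who want to try the matching logic outside Coccinelle, here is a minimal Python sketch of the same substring test. The helper names are made up for illustration. The second helper is a stricter whole-word variant that is not part of the script above; it would skip a message such as "disconnected\n" in a hypothetical function named disconnect.

```python
import re

def contains_name(func_name, fmt):
    """Plain substring test, comparable to the Str.split_delim check."""
    return func_name in fmt

def contains_name_as_word(func_name, fmt):
    """Stricter variant (an assumption, not in the original script):
    the name must not be surrounded by other identifier characters."""
    pattern = (r"(?<![A-Za-z0-9_])" + re.escape(func_name)
               + r"(?![A-Za-z0-9_])")
    return re.search(pattern, fmt) is not None
```

The strings passed in are the format strings as they appear in the C source, so a C newline escape shows up as a backslash followed by n.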

Here are some results:

drivers/net/wireless/zd1211rw/zd_usb.c:1573: "%s usb_exit()\n"
drivers/net/wireless/zd1211rw/zd_usb.c:1551: "%s usb_init()\n"
drivers/net/wireless/zd1211rw/zd_usb.c:1453: "disconnected\n"
drivers/net/wireless/rtlwifi/rtl8723be/dm.c:800: "rtl8723be_dm_txpower_tracking_callback_thermalmeter\n"
drivers/net/wireless/rtlwifi/rtl8723be/fw.c:461: "rtl8723be_set_fw_rsvdpagepkt(): HW_VAR_SET_TX_CMD: ALL\n"
drivers/net/wireless/rtlwifi/rtl8723be/fw.c:464: "rtl8723be_set_fw_rsvdpagepkt(): HW_VAR_SET_TX_CMD: ALL\n"
drivers/net/wireless/rtlwifi/rtl8723be/phy.c:451: "<===_rtl8723be_phy_convert_txpower_dbm_to_relative_value()\n"
drivers/net/wireless/rtlwifi/rtl8192ee/fw.c:721: "rtl92ee_set_fw_rsvdpagepkt(): HW_VAR_SET_TX_CMD: ALL\n"
drivers/net/wireless/rtlwifi/rtl8192ee/fw.c:724: "rtl92ee_set_fw_rsvdpagepkt(): HW_VAR_SET_TX_CMD: ALL\n"
drivers/net/wireless/rtlwifi/rtl8192ee/phy.c:635: "<==phy_convert_txpwr_dbm_to_rel_val()\n"

The idea would be to replace these by %s and __func__? That would also be
possible.
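
As a rough illustration of what such a replacement involves, the Python sketch below rewrites a format string so that the embedded name becomes %s, and reports the argument position where __func__ would have to be inserted. The function name use_func_macro is hypothetical, and the conversion counting is deliberately simple: it treats %% as a literal percent sign and does not handle a `*` width or precision, which consumes an extra argument.

```python
import re

def use_func_macro(func_name, fmt):
    """Replace the first whole-word occurrence of func_name in fmt by %s.

    Returns (new_fmt, arg_index), where arg_index is the number of
    conversions that precede the replaced name, i.e. the position at
    which __func__ would go in the argument list. Returns None when
    the name does not occur.
    """
    pattern = (r"(?<![A-Za-z0-9_])" + re.escape(func_name)
               + r"(?![A-Za-z0-9_])")
    m = re.search(pattern, fmt)
    if m is None:
        return None
    new_fmt = fmt[:m.start()] + "%s" + fmt[m.end():]
    # Drop literal "%%" pairs, then every remaining "%" starts a conversion.
    arg_index = fmt[:m.start()].replace("%%", "").count("%")
    return new_fmt, arg_index
```

For the first result listed above, the name follows an existing %s, so __func__ would become the second argument rather than the first.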

julia
Joe Perches
2014-12-05 07:28:36 UTC
Post by Julia Lawall
@r@
char [] c;
position p;
identifier f;
@@

f(...,c@p,...)

@script:ocaml@
c << r.c;
p << r.p;
@@

let ce = (List.hd p).current_element in
if List.length (Str.split_delim (Str.regexp ce) c) > 1
then
  Printf.printf "%s:%d: %s\n"
    (List.hd p).file (List.hd p).line c
Good to know, thanks.
Post by Julia Lawall
drivers/net/wireless/zd1211rw/zd_usb.c:1573: "%s usb_exit()\n"
[]
Post by Julia Lawall
The idea would be to replace these by %s and __func__? That would also be
possible.
Yes and no.

A lot of these are function tracing style messages and
those should just be deleted.

Otherwise, yes.
Julia Lawall
2014-12-05 07:32:33 UTC
Post by Joe Perches
Post by Julia Lawall
The idea would be to replace these by %s and __func__? That would also be
possible.
Yes and no.
A lot of these are function tracing style messages and
those should just be deleted.
Would it be possible to characterize what a function tracing style message
would be? Something like
"<===_rtl8723be_phy_convert_txpower_dbm_to_relative_value()\n"
that looks like it may represent a function exit?

Anything that doesn't contain %?

Anything that is not under an if?
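
To make the "no %" idea concrete, here is a small Python sketch of a purely textual heuristic. It is only an assumption about what a tracing message might look like, based on the results above: no conversions, and nothing left once the function name, arrows, parentheses and the trailing newline escape are removed. The name looks_like_trace is made up for this example.

```python
import re

def looks_like_trace(func_name, fmt):
    """Guess whether fmt is a bare enter/exit marker for func_name."""
    if "%" in fmt:
        return False
    rest = fmt.replace(func_name, "")
    # The C newline escape appears in source text as backslash + n.
    rest = rest.replace("\\n", "")
    # Strip decoration commonly seen around such markers.
    rest = re.sub(r"[\s<>=\-()_:]", "", rest)
    return rest == ""
```

A message that carries real text after the name, such as the HW_VAR_SET_TX_CMD ones, is not flagged by this check.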

julia
Joe Perches
2014-12-05 07:49:30 UTC
Post by Julia Lawall
Post by Joe Perches
Post by Julia Lawall
Post by Joe Perches
Is it possible for coccinelle to look at the name
of a function that might be embedded in a format?
[]
[]
Post by Julia Lawall
Post by Joe Perches
Post by Julia Lawall
drivers/net/wireless/zd1211rw/zd_usb.c:1573: "%s usb_exit()\n"
[]
Post by Julia Lawall
The idea would be to replace these by %s and __func__? That would also be
possible.
Yes and no.
A lot of these are function tracing style messages and
those should just be deleted.
Would it be possible to characterize what a function tracing style message
would be? Something like
"<===_rtl8723be_phy_convert_txpower_dbm_to_relative_value()\n"
that looks like it may represent a function exit?
Anything that doesn't contain %?
Some of these tracing style messages are also
emitting arguments. I'd still delete those.
Post by Julia Lawall
Anything that is not under an if?
For function entrances:
Maybe something in a function's first few non-declaration
statements that is not under an if/while/do/for/switch?

For function exits:
Maybe the last couple of statements just before a return
statement or the closing brace (also not under an if/etc.)?
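
The positional idea can be sketched on a toy model. The Python below assumes a function body has already been reduced to a list of (kind, text) pairs for its top-level statements only, so anything nested under an if or a loop is simply not in the list. That representation, the kind labels and the name trace_candidates are all hypothetical; a real check would be written against Coccinelle's own view of the code.

```python
def trace_candidates(body, window=2):
    """Return (entry, exit) index lists of calls near the start or end.

    body: list of (kind, text) pairs for top-level statements, with
    kind being e.g. "decl", "call", "assign", "if" or "return".
    """
    # Declarations do not count as statements for this purpose.
    stmts = [(i, kind) for i, (kind, _) in enumerate(body) if kind != "decl"]
    # Ignore a trailing return when looking at the end of the function.
    tail = stmts[:-1] if stmts and stmts[-1][1] == "return" else stmts
    entry = [i for i, kind in stmts[:window] if kind == "call"]
    exit_ = [i for i, kind in tail[max(0, len(tail) - window):]
             if kind == "call"]
    return entry, exit_

body = [
    ("decl", "int ret;"),
    ("call", 'printk("foo: enter\\n");'),
    ("assign", "ret = do_work();"),
    ("if", "if (ret) goto out;"),
    ("assign", "x = 1;"),
    ("call", 'printk("foo: exit\\n");'),
    ("return", "return ret;"),
]
entry, exit_ = trace_candidates(body)
```

The calls found this way would still need the name test applied to their format strings before being reported.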

Rasmus Villemoes
2014-12-05 15:14:26 UTC
Post by Joe Perches
Post by Julia Lawall
The idea would be to replace these by %s and __func__? That would also be
possible.
Yes and no.
A lot of these are function tracing style messages and
those should just be deleted.
Hardcoding the function name in a literal string also makes typos (or
copy-pastos) possible. I extended Julia's code to allow a small edit
distance. It requires the Levenshtein Python module (on Debian, apt-get
install python-levenshtein). It's not terribly slow, but to be really
useful I (or someone) would need to reduce the number of false positives.

One gets for example

drivers/net/wireless/rtlwifi/rtl8821ae/dm.c:2082:rtl8821ae_dm_txpwr_track_set_pwr():2: "===>rtl8812ae_dm_txpwr_track_set_pwr\n"
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:1229:bnx2x_ets_e3b0_config():2: "bnx2x_ets_E3B0_config SP failed\n"

The first of these is probably one of the tracing style messages; the
second probably falls in the 'should use %s, __func__' category (other
strings in that function actually use lowercase e3b0).


@initialize:python@
@@
import re
from Levenshtein import distance
mindist = 1
maxdist = 2
ignore_leading = True

@r@
char [] c;
position p;
identifier f;
@@

f(...,c@p,...)

@script:python@
c << r.c;
p << r.p;
@@

func = p[0].current_element
wpattern = "[a-zA-Z_][a-zA-Z0-9_]*"
if ignore_leading:
    func = func.strip("_")
    wpattern = "[a-zA-Z][a-zA-Z0-9_]*"
lf = len(func)
// ignore extremely short function names
if lf > 3:
    words = [w for w in re.findall(wpattern, c) if abs(len(w) - lf) <= maxdist]
    for w in words:
        d = distance(w, func)
        if mindist <= d and d <= maxdist:
            print("%s:%d:%s():%d: %s" % (p[0].file, int(p[0].line), func, d, c))
            break
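
For experimenting without Coccinelle or the Levenshtein module, the near-miss test can be approximated in plain Python. The sketch below uses a textbook dynamic-programming edit distance as a stand-in for Levenshtein.distance; the function names are made up, and the filtering mirrors the script above (leading underscores ignored, names of three characters or fewer skipped).

```python
import re

def edit_distance(a, b):
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]

def near_miss(func, fmt, mindist=1, maxdist=2):
    """Return (word, distance) for the first word in fmt that is close
    to func without being identical to it, or None."""
    func = func.strip("_")
    if len(func) <= 3:
        return None
    for w in re.findall(r"[a-zA-Z][a-zA-Z0-9_]*", fmt):
        if abs(len(w) - len(func)) > maxdist:
            continue
        d = edit_distance(w, func)
        if mindist <= d <= maxdist:
            return w, d
    return None
```

Because this distance has no transposition operation, swapping two adjacent characters (8821 versus 8812) counts as two substitutions, which matches the distance of 2 reported in the first example above.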



Rasmus
Julia Lawall
2014-12-05 15:32:16 UTC
Post by Rasmus Villemoes
Hardcoding the function name in a literal string also makes typos (or
copy-pastos) possible. I extended Julia's code to allow a small edit
distance. Requires the Levenshtein python module (on debian, apt-get
install python-levenshtein). It's not terribly slow, but to be really
useful I (or someone) would need to reduce the number of false positives.
Cool. Thanks!

julia
Joe Perches
2014-12-05 15:54:34 UTC
Post by Rasmus Villemoes
Hardcoding the function name in a literal string also makes typos (or
copy-pastos) possible. I extended Julia's code to allow a small edit
distance. Requires the Levenshtein python module (on debian, apt-get
install python-levenshtein). It's not terribly slow, but to be really
useful I (or someone) would need to reduce the number of false positives.
I agree with Julia, cool, thanks.