FLTK logo

STR #3176

FLTK matrix user chat room
(using Element browser app)   FLTK gitter user chat room   GitHub FLTK Project   FLTK News RSS Feed  
  FLTK Apps      FLTK Library      Forums      Links     Login 
 Home  |  Articles & FAQs  |  Bugs & Features  |  Documentation  |  Download  |  Screenshots  ]
 

Return to Bugs & Features | Roadmap 1.3 | SVN ⇄ GIT ]

STR #3176

Application:FLTK Library
Status:1 - Closed w/Resolution
Priority:3 - Moderate, e.g. unable to compile the software
Scope:3 - Applies to all machines and operating systems
Subsystem:Core Library
Summary:Fl_Menu_Item::setonly() can overrun memory when radio buttons at top of menu
Version:1.3-current
Created By:greg.ercolano
Assigned To:greg.ercolano
Fix Version:1.3.4 (SVN: v10647)
Update Notification:

Receive EMails Don't Receive EMails

Trouble Report Files:


Name/Time/Date Filename/Size  
 
#1 greg.ercolano
08:01 Jan 08, 2015
test-bug.cxx
1k
 
 
#2 greg.ercolano
08:01 Jan 08, 2015
v1.patch
2k
 
 
#3 greg.ercolano
08:02 Jan 08, 2015
v2.patch
3k
 
 
#4 AlbrechtS
07:11 Jan 11, 2015
test-bug_as.cxx
3k
 
 
#5 AlbrechtS
07:23 Jan 11, 2015
v3.patch
5k
 
 
#6 AlbrechtS
08:23 Jan 11, 2015
v4.patch
3k
 
 
#7 manolo
09:22 Mar 19, 2015
v5.patch
3k
 
 
#8 AlbrechtS
17:14 Mar 19, 2015
v6.patch
3k
 
 
#9 AlbrechtS
09:06 Mar 21, 2015
performance-test.txt
2k
 
     

Trouble Report Comments:


Name/Time/Date Text  
 
#1 greg.ercolano
07:37 Jan 08, 2015
Creating this STR on behalf of 'snowcoder' who posted this on fltk.general (01/04/2015):

"""
Subject: Sanity check, fundamental bug in setonly() or known limitation?

Am I missing some documented limitation about radio items in a menu, or this there pretty nasty bug in Fl_Menu_Item::setonly when you have a menu that starts with radio menu items. Basically along the lines of:

Fl_Menu_Item myMenu[] = { ... };
...
myMenu[0].setonly();

(setonly could also be called by Fl_Menu_::picked)


This particular part of Fl_Menu_Item::setonly()

  for (j = this-1; ; j--) { // go up
    if (!j->text || (j->flags&FL_MENU_DIVIDER) || !j->radio()) break;
    j->clear();
  }

will happily access memory outside of the Fl_Menu_Item array, and even worse, overwrite that memory, if you aren't lucky and the random mem contents meets the break-condition before any harm is done.

I've tried find some mention in the docs, but I couldn't see anything that says you're not allowed to start a menu with radio items.
"""

Apparently this problem has always been in FLTK.
 
 
#2 greg.ercolano
08:01 Jan 08, 2015
OK, so a long fltk.general thread came out of this issue,
and was moved to fltk.coredev for further discussion.

The reduction of this thread is basically:

    1. Yes, this is a problem.

    2. The problem with Fl_Menu_Item::setonly() is difficult to solve
       because items don't have back references to the parent, which
       knows the beginning of the array.

    3. Several solutions were put forward by Albrecht and Greg
       to prevent walking off the top of the array..

          a. Albrecht recommended adding an FL_MENU_START flag to indicate
             the start of the menu. Doesn't fix old code, but could make new
             code stable.

          b. Greg recommended deprecating Fl_Menu_Item::setonly() in favor of
             re-implementing the method as Fl_Menu_::setonly(), so that menu()
             is accessible to detect the top of the array.

    4. Greg posted some example code that demonstrates the problem (attached here as test-bug.cxx)
       and included a simple patch suggestion to solve the issue. (attached here as v1.patch)

    5. Manolo pointed out that the patch didn't handle the issue of FL_SUBMENU_POINTER,
       and provided a modified patch that included the footwork needed to detect this
       tricky situation and handle it. (attached here as v2.patch)

Albrecht would like to do some boundary case tests, but it seems manolo's version of the
patch might be the best final solution.

Regarding #4, Greg writes how to use the test-bug.cxx program:
    """
    Depending on your system's random memory contents, choosing item "One"
    will setonly() the "Red" radio button, and it might or might not crash.

    But you can *see* how that second loop in Fl_Menu_Item::setonly()
    walks "up" and off the top of the array *if* you add some printf() statements
    to [the loop].
    """
..and that to test this newer code (with either v1.patch or v2.patch), you'd need to
replace the following line in test-bug.cxx:
  if ( strcmp(m->label(), "One") == 0 ) G_item->setonly(); // "old" way
..with this "safer" code:
  if ( strcmp(m->label(), "One") == 0 ) mw->setonly(G_item); // "new" way
 
 
#3 greg.ercolano
17:04 Jan 08, 2015
It's also suggest we add some doxygen docs (not shown in the patches):

    o Clarifying how to use RADIO buttons, with both descriptions
      and by example code.

    o Describe the workaround for older versions of FLTK that still use
      the old and now deprecated Fl_Menu_Item::setonly(), which might
      involve adding invisible non-radio items at the top of the menu
      array, above the first radio item.

    o I believe there's a special doxygen command \deprecated
      for deprecating a method; let's use that in the docs for the
      older setonly() method.
 
 
#4 AlbrechtS
15:59 Jan 09, 2015
I agree that Manolo's patch looks good.

I'll do my tests tomorrow or on Sunday (if time permits).
 
 
#5 AlbrechtS
07:11 Jan 11, 2015
My modified test program 'test-bug_as.cxx' (as discussed in fltkcoredev, but with one additional "feature") shows another potential bug in the 'v2' patch.

The bug happens if

  first = first_submenu_item(item, menu_);

returns NULL in Fl_Menu_::setonly(Fl_Menu_Item* item). However 'first' is used in the next loop, regardless of the returned value. The easy fix was to 'return' if this happens, but this was not the only issue.

When does this happen? Usually not. However a common programming error would be to use a stale or other unrelated pointer to the menu item in

  Fl_Menu_::setonly(Fl_Menu_Item* item)

I consider this a *common* problem because Fl_Menu_::add() copies the menu array to malloc'd memory and renders all stored pointers useless (and dangerous).

My test program causes a crash if the statement '#if 1' below the comment "MAKE IT CRASH" is active (as opposed to '#if 0'). Only the "new way" for setting the "Red" button is concerned, so the next '#if' below this must be '#if 0' (new way) to really cause a crash. As in Greg's version you need to pick the "One" submenu entry to see the effect.

The "old way" would "happily" modify the old, stale menu array (or anything else the stale pointer points to), but it would only do any harm if the radio item group was at the beginning of the item array. Or if the menu item pointer points to something totally unrelated.

The "new way" would change the array in the forward direction, but crash when it comes to walking back. Although my patch fixes the latter, I believe that we should first try to find the 'first' menu entry before we change anything. This way Fl_Menu_::setonly(item) would not do any harm if the pointer was unrelated (stale). We should probably also not _set_ the menu button if we can't find the 'first' radio button of the group to prevent even more hazard.

This is not (yet) in my v3 patch (see follow-up) - I'm asking for your comments first.

More to follow...
 
 
#6 AlbrechtS
07:23 Jan 11, 2015
The 2nd part of the v3 patch fixes an issue in new function 'first_submenu_item()'. I'm not sure if this is the best fix, maybe we could optimize it. However, it does its job in my tests.

Here is the incremental patch (inline):

diff --git a/src/Fl_Menu_.cxx b/src/Fl_Menu_.cxx
index 7282ab4..9d5ffdd 100644
--- a/src/Fl_Menu_.cxx
+++ b/src/Fl_Menu_.cxx
@@ -290,6 +290,9 @@ static Fl_Menu_Item *first_submenu_item(Fl_Menu_Item *item, Fl_Menu_Item *start,
       Fl_Menu_Item *first;
       if (j->flags&FL_SUBMENU_POINTER) {
         first = (Fl_Menu_Item*)j->user_data();
+       first = first_submenu_item(item, first);        // v3
+       if (first) return first;                        // v3
+       first = j + 1;                                  // v3
       } else {
         first = j + 1;
       }
@@ -318,6 +321,7 @@ void Fl_Menu_::setonly(Fl_Menu_Item* item) {
   } else { // FL_SUBMENU_POINTER is involved
     first = first_submenu_item(item, menu_);
   }
+  if (!first) return;                          // STR #3176: v3
   for (j = item-1; j>=first; j--) { // go up
     //DEBUG printf("GO UP: WORKING ON: item='%s', flags=%x\n", j->text, j->flags);
     if (!j->text || (j->flags&FL_MENU_DIVIDER) || !j->radio()) break;


The upper part is the part I'm talking about now. The v2 version would return a new value of j (when called with a valid 3rd argument) that points beyond the submenu as the new 'next' item, if the item is not included in the submenu. This is correct in the case of a normal submenu, but points outside the menu array in a 'detached' submenu (i.e. when using FL_SUBMENU_POINTER). My patch fixes this.

I'm not sure if this is all, but hopefully it is.

Manolo, could you take a look at my fix and check if there is a better way?

The full patch is appended in file 'v3.patch'.
 
 
#7 AlbrechtS
07:25 Jan 11, 2015
One more note: the patch contains changes to test/menubar.cxx.

I'm not sure if we want to do this in the final fix. I don't see a problem, but just wanted to ask for opinions.
 
 
#8 AlbrechtS
07:31 Jan 11, 2015
FYI: the stale pointer I'm using in my test case to cause the crash is Fl_Menu_Item *G_item. This pointer is stored in the initialization part of the program, but it is still used after adding the 'crash' menu in the callback when 'One' is picked.  
 
#9 AlbrechtS
08:23 Jan 11, 2015
Uploaded file 'v4.patch' includes only changes in FL/* and src/*, I dropped the change in test/menubar.cxx for shortness.

v3.patch accidentally contained Greg's test file. Please ignore.

Other changes in v4: the order of checking whether 'item' is valid (contained in the menu) and setting and clearing menu button states has been reversed, as suggested previously.

I tested even more and modified the test program slightly. It still seems to work most of the time, but after repeated actions I can still crash the program. I'm not sure if this is a program error or another bug in the library code.

Anyway: the crash appears to be in the first recursion (2nd call) of first_submenu_item().  :-(

Following is the output and the backtrace, in case it's useful for someone. I'm taking a break now...

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Joe (userdata=0x702)
One (userdata=0x201)
menu = menubar
menu item 'crash' added...
-- NEW way: mw->setonly(G_item);
One (userdata=0x201)
menu = menubar
menu item 'crash' added...
-- NEW way: mw->setonly(G_item);
Two (userdata=0x202)
One (userdata=0x201)
menu = menu2
menu item 'crash' added...
-- NEW way: mw->setonly(G_item);
One (userdata=0x201)
menu = menubar
menu item 'crash' added...
-- NEW way: mw->setonly(G_item);

Program received signal SIGSEGV, Segmentation fault.
0x000000000040d31d in first_submenu_item(Fl_Menu_Item*, Fl_Menu_Item*, Fl_Menu_Item**) ()
(gdb) bt
#0  0x000000000040d31d in first_submenu_item(Fl_Menu_Item*, Fl_Menu_Item*, Fl_Menu_Item**) ()
#1  0x000000000040d345 in first_submenu_item(Fl_Menu_Item*, Fl_Menu_Item*, Fl_Menu_Item**) ()
#2  0x000000000040d830 in Fl_Menu_::setonly(Fl_Menu_Item*) ()
#3  0x000000000040557e in menu_cb(Fl_Widget*, void*) ()
#4  0x000000000040d908 in Fl_Menu_::picked(Fl_Menu_Item const*) ()
#5  0x000000000040db7e in Fl_Menu_Bar::handle(int) ()
#6  0x000000000040869a in Fl_Group::handle(int) ()
#7  0x000000000040584b in send(int, Fl_Widget*, Fl_Window*) ()
#8  0x0000000000406c91 in Fl::handle_(int, Fl_Window*) ()
#9  0x0000000000415f90 in fl_handle(_XEvent const&) ()
#10 0x000000000041603e in do_queued_events() ()
#11 0x0000000000416242 in fl_wait(double) ()
#12 0x0000000000406a80 in Fl::wait(double) ()
#13 0x0000000000406aa2 in Fl::run() ()
#14 0x00000000004056eb in main ()
(gdb)

Here are the relevant changes of the test program:

#if 1 // crash the patched version (v2)

  mw->add( // this copies the menu() elsewhere
    "crash", 0, menu_cb, (void *)0x800, FL_MENU_TOGGLE);
  printf ("menu item 'crash' added...\n");

#endif // (crash in v2, fixed in v3)

This adds the 'crash' menu to the selected menu (mw) instead of always menubar, as it was previously. The crash seems to appear only after use *both* 'One' menus and repeat one of them. I'm aware that this would add the 'crash' menu item twice, but I don't see it twice in any menu. Does add() check if a menu entry with that label exists?
 
 
#10 greg.ercolano
11:34 Jan 11, 2015
Interesting side note: I was looking at some of the other Fl_Menu_ code;
it seems the FL_SUBMENU_POINTER case is not taken into account in many
of the other unrelated methods (e.g. int find_item(item), item_pathname())

Seems like all the code needs to be checked for handling this situation,
as there's definitely some assumptions being made that all items
are within the menu_ array.
 
 
#11 manolo
09:26 Mar 19, 2015
I believe v5.patch is the good one.

The internal first_submenu_item() function has been changed
a little: it returns the last menu item (instead of the
item after that). Therefore there's no risk of getting
a pointer that wanders outside the menu.

Also, Fl_Menu_::setonly(Fl_Menu_Item* item) calls first thing
first_submenu_item() and returns without change if item
does not belong to the menu.
 
 
#12 AlbrechtS
17:14 Mar 19, 2015
Hi Manolo,

I looked at your patch (v5) and I believe that it is so far okay. However I had something in mind anyway, and so I implemented a variation of your first_submenu_item() function that is IMHO simpler, less code, and better to understand (no extra return argument #3).

Greg and Manolo, please take a look at 'v6.patch'. I tested it and it looks good, but it's too late now for more tests. The only difference between v5 and v6 is the first_submenu_item() function. I'm listing it here for discussion:

static Fl_Menu_Item *first_submenu_item(Fl_Menu_Item *item, Fl_Menu_Item *start) {

  Fl_Menu_Item *end = start + start->size();

  if (item >= start && item < end) return start; // this line is optional [1a]

  for (Fl_Menu_Item *j = start; j<end; j++) {

    if (j == item) return start; // this is optional [1b]

    if (j->flags&FL_SUBMENU_POINTER) {
      Fl_Menu_Item *first = first_submenu_item(item, (Fl_Menu_Item*)j->user_data());
      if (first) return first;
    }
  }
  return NULL;
}

Some details:

First of all I believe that we don't need the exact start of the submenu of the radio buttons; we only need a limit how far the search is allowed to go back (the missing part in the old software). If it is the first menu item, that's enough, because the backwards search in setonly() will stop at submenu boundaries anyway. This makes my version somewhat simpler.

The start->size() method gives us the number of menu entries across submenus in /this/ menu array w/o "detached" FL_SUBMENU_POINTER submenus, but with all "regular" submenus.

The "optional" line [1a] makes the search terminate if the item is within the address boundaries [start,end). This is a simple and fast comparison. If it is false, then we must only find out if there is a FL_SUBMENU_POINTER entry in the list and search that submenu, and so on, recursively. The for(...) loop does this.

The "optional" line [1b] can be removed if [1a] is used: if condition [1a] is false (i.e. the loop is executed), then [1b] can't be true. Hence I believe that [1a] should be used and [1b] can be removed.

Did I miss anything? Comments?
 
 
#13 manolo
14:53 Mar 20, 2015
I believe it's not efficient to use Fl_Menu_::size()
because its implemented (in Fl_Menu_Item::size(), really)
by walking across all menu items, one by one.
It's preferable to test for the presence of item while
walking across the menu.
Without the end value, the code becomes v5.patch.
 
 
#14 manolo
14:59 Mar 20, 2015
In comment #13 above, the statement about Fl_Menu_::size()
is not relevant, because Fl_Menu_Item::size() is used in v6.patch.
But the main argument remains unchanged.
 
 
#15 AlbrechtS
09:06 Mar 21, 2015
I believe that both patches (v5.patch and v6.patch w/o line [1b]) are good and solve the issue.

I don't believe that performance is an issue here though. setonly() is usually only called when a user clicks a radio button, and size() is very quick (simple comparisons).

Anyway, I was curious and did some tests. See the results in uploaded file performance-test.txt. Note that the menu used in this test had ~12 menu items including one FL_SUBMENU_POINTER and the terminating {0} (a modified version of test-bug_as.cxx (file #4).

Although there is a small performance penalty using v6a.patch (~14%) I'd still prefer this one because of simplicity and therefore better maintainability (remember the first version that didn't work because of wrong indexing).

The measurement done suggests that it is only a matter of microseconds.

That said: please feel free to use any patch you like (v5 or "v6a", I wouldn't mind) or wait for Greg to tell his opinion.

IMHO we should get that fixed and this STR closed ASAP for release 1.3.4. TIA
 
 
#16 greg.ercolano
09:45 Mar 21, 2015
After a loong week of work, I have some time today to put towards FLTK, so I'll take a look.  
 
#17 manolo
12:29 Mar 21, 2015
The best of both worlds :

static Fl_Menu_Item *first_submenu_item(Fl_Menu_Item *item, Fl_Menu_Item *start)
{
  Fl_Menu_Item* m = start;
  int nest = 0;
  while (true) {
    if (!m->text) {
      if (!nest) return NULL;
      nest--;
    } else {
      if (m == item) return start;
      if (m->flags & FL_SUBMENU_POINTER) {
        Fl_Menu_Item *first = first_submenu_item(item, (Fl_Menu_Item*)m->user_data());
        if (first) return first;
      }
      else if (m->flags & FL_SUBMENU) {
        nest++;
      }
    }
    m++;
  }
}
 
 
#18 AlbrechtS
13:19 Mar 21, 2015
Great! Looks good!  
 
#19 greg.ercolano
13:53 Mar 21, 2015
If you guys agree, then 'problem solved' is my opinion ;)

Regarding range checking on the pointer addresses from the app (comment #5)
I'm kinda -1 on that.

For one thing, addresses can be reused, and therefore a stale pointer
could still be 'valid'. So it's not a guaranteed solution.

Also, I recall submitting a patch a loong time ago that did range checks
on app supplied data in all my methods, and the response from Bill + Matt
was don't do all those range checks unless absolutely necessary.

Mainly for speed and to keep the code clean + simple. While speed sounds
like a non-issue (a few simple ptr compares don't add up to much), it adds up if
you spread this philosophy across the entire lib, and factor in tight loops
calling these methods, speed does become an issue, and a lot of needless
repeated range checks are done.

So the general philosophy was: keep the code simple, and don't try to
protect the user from themselves: garbage in, garbage out.

Only do what the docs guarantee, and only make a guarantee if it
/really/ makes sense to do so, and is not just a casual convenience.
Just because we can do the checks doesn't mean we should; it could
prevent us from doing optimizations down the line if we contract with
the user we check for wild ptrs.

Since most of the lib is written to not range check for wild pointers,
and avoiding that helps for speed, we should stick with the precedent IMHO.
 
 
#20 manolo
00:10 Mar 22, 2015
Fixed in Subversion repository.  
     

Return to Bugs & Features ]

 
 

Comments are owned by the poster. All other content is copyright 1998-2024 by Bill Spitzak and others. This project is hosted by The FLTK Team. Please report site problems to 'erco@seriss.com'.