Item10672: Foswiki can crash mongod...

pencil
Priority: Normal
Current State: Closed
Released In: n/a
Target Release: n/a
Applies To: Extension
Component: MongoDBPlugin
Branches:
Reported By: PaulHarvey
Waiting For: PaulHarvey
Last Change By: PaulHarvey
Related to Item10670

I managed to crash our mongod primary doing an updateweb of ~80k topics (it reached ~55k at ~08:32 and died).

Couldn't shutdown mongo with service ... stop or SIGTERM (you can see it got a signal at 08:34:27, but never completed shutdown).

I had to use killall -KILL mongod to make it die.

I then ran a dbrepair and restarted. It spent some time in state ROLLBACK before going RECOVERING and then eventually SECONDARY.

Attached is the log, for what it's worth... it's possible there is nothing we can do for this task, except collect this kind of crash information.

I'm using a 3-member replSet as follows

/usr/bin/mongod --dbpath /var/lib/mongodb/seta --logpath /var/log/mongodb/seta.log run --config /etc/mongodb/seta.conf --port 27017 --replSet seta

And /etc/mongodb/seta.conf contains simply:
logappend=true
noprealloc = true
smallfiles = true
directoryperdb = true

Version:
mongod --version
db version v1.8.1, pdfile version 4.5
Sat Apr 23 09:58:43 git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04

replset config:
mongo mongo1.wiki.org
MongoDB shell version: 1.8.1
connecting to: localhost
seta:SECONDARY> rs.status()
{
   "set" : "seta",
   "date" : ISODate("2011-04-22T23:59:15Z"),
   "myState" : 2,
   "members" : [
      {
         "_id" : 0,
         "name" : "mongo1.wiki.org:27017",
         "health" : 1,
         "state" : 2,
         "stateStr" : "SECONDARY",
         "optime" : {
            "t" : 1303514243000,
            "i" : 8
         },
         "optimeDate" : ISODate("2011-04-22T23:17:23Z"),
         "self" : true
      },
      {
         "_id" : 1,
         "name" : "mongo2.wiki.org:27017",
         "health" : 1,
         "state" : 1,
         "stateStr" : "PRIMARY",
         "uptime" : 4925,
         "optime" : {
            "t" : 1303516753000,
            "i" : 4
         },
         "optimeDate" : ISODate("2011-04-22T23:59:13Z"),
         "lastHeartbeat" : ISODate("2011-04-22T23:59:13Z")
      },
      {
         "_id" : 2,
         "name" : "mongo3.wiki.org:27017",
         "health" : 1,
         "state" : 2,
         "stateStr" : "SECONDARY",
         "uptime" : 4925,
         "optime" : {
            "t" : 1303514243000,
            "i" : 8
         },
         "optimeDate" : ISODate("2011-04-22T23:17:23Z"),
         "lastHeartbeat" : ISODate("2011-04-22T23:59:15Z"),
         "errmsg" : "syncThread: 10156 cannot update system collection: foswiki__System.MigrationScriptsInsectsDemo.system.indexes q: { ns: \"foswiki__System.MigrationScriptsInsectsDemo.current\", key: { _topic: 1 }, name: \"_topic\", v: 0 } u: { ns: \"foswiki__System.MigrationScr"
      }
   ],
   "ok" : 1
}
seta:SECONDARY>

The log has been slightly anonymised (IP addresses replaced with mongo[1-3].wiki.org).

-- PaulHarvey - 22 Apr 2011

And I don't know how to fix the "cannot update system collection" error...

-- PaulHarvey - 23 Apr 2011

Perhaps we can also use this task to get to the bottom of why there's all these ClientCursor::yield can't unlock b/c of recursive lock ns: foswiki__TaxonProfile__Samples__Eremophila.system.js

-- PaulHarvey - 23 Apr 2011

This kind of error:
 "Sun May  8 12:07:43 [replica set sync] replSet syncThread: 10156 cannot update system collection: foswiki__TaxonProfile.Relationships.system.indexes q: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_top
ic", background: true, v: 0 } u: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 }

... seems to be happening under normal circumstances, not just during a bulk load. When it does, their errMsg stays stuck on this, but the status is still SECONDARY, which is misleading because they actually have stopped replicating. I think. So when I restart the primary, when it comes back, the secondaries suddenly realise their data is stale and the oplog can't help them. Their status changes to
error RS102 too stale to catch up
.

In this case I have to delete /var/lib/mongodb/seta/* and re-start them (eventually they sync up again under their own steam).

-- PaulHarvey - 08 May 2011

It's as if they've got "stuck" with that "cannot update system collection" error, but then knocking out the primary and forcing the trio to have another election suddenly makes them want to try to continue syncing again (but by the time I kick the PRIMARY the secondaries are already too stale to catch up).

-- PaulHarvey - 08 May 2011

Hmm, I think I've confirmed my suspicions. Both of the SECONDARY's optimes were stuck on 2011-05-05T05:37:08Z aka 15:37 Canberra time (btw the PRIMARY optime was 2011-05-08T08:19:33Z). Apart from connection notices, the only thing in the mongo logs was this:

Thu May  5 15:37:08 [replica set sync] building new index on { _id: 1 } for foswiki__TaxonProfile.Relationships.system.js
Thu May  5 15:37:08 [replica set sync] done for 0 records 0.007secs
Thu May  5 15:37:08 [replica set sync] replSet syncThread: 10156 cannot update system collection: foswiki__TaxonProfile.Relationships.system.indexes q: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 } u: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 }
Thu May  5 15:37:15 [conn4712] end connection wiki.example.org:47840
Thu May  5 15:37:19 [replica set sync] replSet syncThread: 10156 cannot update system collection: foswiki__TaxonProfile.Relationships.system.indexes q: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 } u: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 }
Thu May  5 15:37:30 [replica set sync] replSet syncThread: 10156 cannot update system collection: foswiki__TaxonProfile.Relationships.system.indexes q: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 } u: { ns: "foswiki__TaxonProfile.Relationships.current", key: { _topic: 1 }, name: "_topic", background: true, v: 0 }

So there ya go...

-- PaulHarvey - 08 May 2011

Okay, so there's three distinct issues here
  • During bulk-update, we need to protect "slow pages" on the wiki - deep/expensive queries during bulk import seem to increase chance of crashyness.
  • Sometimes a db goes screwy, for example, right now on TaxonProfile/Builder web all the searches are giving Could not perform search. Error was: query error: js init failed or Could not perform search. Error was: query error: JS_NewObject failed: toJSObject2 (2011-05-08T23:10 canberra time) - and other searches are working?
  • MongoDB itself seems to have issues setting up an index on the system.js collection, sometimes. I've posted to the mongodb-users list

-- PaulHarvey - 08 May 2011
Argh, the PRIMARY crashed again last night. Happily nobody noticed, the secondaries did their job.

We started getting JS Error: out of memory errors.

Sun May  8 20:46:27 [initandlisten] connection accepted from wiki.example.org:52997 #388
Sun May  8 20:46:28 [conn388] query foswiki__Mangroves.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "Vascular_Plants" } } reslen:64 204ms
Sun May  8 20:48:03 [conn387] query foswiki__Main.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "TWikiUsers" } } reslen:64 215ms
Sun May  8 20:49:25 [conn388] query foswiki__Mangroves.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "Vascular_Plants" } } reslen:64 205ms
Sun May  8 20:50:55 [conn387] query foswiki__Mangroves.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "AllOrganisms" } } reslen:64 199ms
Sun May  8 20:51:00 [conn387] info DFM::findAll(): extent 0:41e00 was empty, skipping ahead foswiki__Marine__Invertebrates.current
Sun May  8 20:51:51 [conn388] query foswiki__Keys.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "EuclidKey" } } reslen:64 213ms
Sun May  8 20:51:58 [conn387] JS Error: out of memory
Sun May  8 20:51:58 [conn387]  foswiki__CapacityBuilding.current Assertion failure JS_InitClass( cx , global , 0 , &numberlong_class , numberlong_constructor , 0 , 0 , numberlong_functions , 0 , 0 ) scripting/sm_db.cpp 1045
0x54e8de 0x55f971 0x610b38 0x61c5c7 0x5e34cf 0x73d993 0x7902f6 0x795a63 0x799715 0x79bc54 0x79c7f7 0x643578 0x7f169d 0x7db210 0x7dc741 0x642c25 0x6477c6 0x752225 0x757938 0x8a3b3e 
 /usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0xae) [0x54e8de]
 /usr/bin/mongod(_ZN5mongo8assertedEPKcS1_j+0x111) [0x55f971]
 /usr/bin/mongod(_ZN5mongo11initMongoJSEPNS_7SMScopeEP9JSContextP8JSObjectb+0x7e8) [0x610b38]
 /usr/bin/mongod(_ZN5mongo7SMScope12localConnectEPKc+0x77) [0x61c5c7]
 /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x55f) [0x5e34cf]
 /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x799715]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
 /usr/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3b8) [0x643578]
 /usr/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xad) [0x7f169d]
 /usr/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x9a0) [0x7db210]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dc741]
 /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x642c25]
 /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x2e86) [0x6477c6]
 /usr/bin/mongod [0x752225]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
 /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
Sun May  8 20:51:58 [conn387] Count with ns: foswiki__CapacityBuilding.current and query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "WebTools" } failed with exception: exception: 0 assertion scripting/sm_db.cpp:1045
Sun May  8 20:51:58 [conn387] query foswiki__CapacityBuilding.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "WebTools" } } reslen:64 1120ms
Sun May  8 20:51:58 [conn387] JS Error: out of memory

This continues for hours, until we get:
Mon May  9 00:48:58 [conn433] JS Error: out of memory
Mon May  9 00:48:58 [conn433] compile failed for: function(dateString) {
    if (dateString == undefined) {
        return dateString;
    }

<... snip lots of JS ...>

    return parsedDate.getTime(); //return seconds since 1970
}
Mon May  9 00:48:58 Invalid access at address: 0

Mon May  9 00:48:58 Got signal: 11 (Segmentation fault).

Mon May  9 00:48:58 Backtrace:
0x8a2e59 0x8a3430 0x7fed99402080 0x90d770 0x62178d 0x622d96 0x604724 0x60540e 0x5e346b 0x73d993 0x7902f6 0x795a63 0x799715 0x79bc54 0x79c7f7 0x64543b 0x752225 0x757938 0x8a3b3e 0x8b6a40 
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x8a2e59]
 /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x220) [0x8a3430]
 /lib/libpthread.so.0 [0x7fed99402080]
 /usr/bin/mongod(JS_GetFunctionObject+0xc) [0x90d770]
 /usr/bin/mongod(_ZN5mongo9Convertor5tovalERKNS_11BSONElementE+0xb2d) [0x62178d]
 /usr/bin/mongod(_ZN5mongo7SMScope10setElementEPKcRKNS_11BSONElementE+0x46) [0x622d96]
 /usr/bin/mongod(_ZN5mongo5Scope10loadStoredEb+0x3a4) [0x604724]
 /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0xee) [0x60540e]
 /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e346b]
 /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x799715]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
 /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xafb) [0x64543b]
 /usr/bin/mongod [0x752225]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
 /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
 /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]

Mon May  9 00:48:58 dbexit: 
Mon May  9 00:48:58 [conn433] shutdown: going to close listening sockets...
Mon May  9 00:48:58 [conn433] closing listening socket: 6
Mon May  9 00:48:58 [conn433] closing listening socket: 7
Mon May  9 00:48:58 [conn433] closing listening socket: 8
Mon May  9 00:48:58 [conn433] closing listening socket: 9
Mon May  9 00:48:58 [conn433] removing socket file: /tmp/mongodb-27017.sock
Mon May  9 00:48:58 [conn433] removing socket file: /tmp/mongodb-28017.sock
Mon May  9 00:48:58 [conn433] shutdown: going to flush diaglog...
Mon May  9 00:48:58 [conn433] shutdown: going to close sockets...
Mon May  9 00:48:58 [conn2] end connection 150.229.2.121:34725
Mon May  9 00:48:58 [conn372] Assertion: 13073:shutting down
0x55ece9 0x751518 0x7587bb 0x8a3b3e 0x8b6a40 0x7fed993fa3ba 0x7fed989bd02d 
 /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
 /usr/bin/mongod(_ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE+0x7f8) [0x751518]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x143b) [0x7587bb]
 /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
 /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
 /lib/libpthread.so.0 [0x7fed993fa3ba]
 /lib/libc.so.6(clone+0x6d) [0x7fed989bd02d]
Mon May  9 00:48:58 [conn373] Assertion: 13073:shutting down
0x55ece9 0x751518 0x7587bb 0x8a3b3e 0x8b6a40 0x7fed993fa3ba 0x7fed989bd02d 
 /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
 /usr/bin/mongod(_ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE+0x7f8) [0x751518]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x143b) [0x7587bb]
 /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
 /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
 /lib/libpthread.so.0 [0x7fed993fa3ba]
 /lib/libc.so.6(clone+0x6d) [0x7fed989bd02d]
Mon May  9 00:48:58 [conn1] end connection 127.0.0.1:45412
Mon May  9 00:48:58 [conn433] shutdown: waiting for fs preallocator...
Mon May  9 00:48:58 [conn433] shutdown: closing all files...
Mon May  9 00:48:58 [conn372] AssertionException in connThread, closing client connection
Mon May  9 00:48:58 [conn372]  shutting down
Mon May  9 00:48:58 [conn373] AssertionException in connThread, closing client connection
Mon May  9 00:48:58 [conn373]  shutting down
Mon May  9 00:48:59 closeAllFiles() finished
Mon May  9 00:48:59 [conn433] shutdown: removing fs lock...
Mon May  9 00:48:59 dbexit: really exiting now
Mon May  9 00:48:59 ERROR: Client::~Client _context should be null but is not; client:conn
-- PaulHarvey - 09 May 2011

Scott seems to think that periods in the DB names are a problem. Can we filter them out? I've created Item10725 to tackle the JS out of memory/ NewObject error

-- PaulHarvey - 09 May 2011

An idea from the list - can we use nolock so that db.eval isn't blocking? Apparently the replicas cannot read ops from the primary during a db.eval (unless nolock is used)

-- PaulHarvey - 09 May 2011

ok, I'm removing the periods - I don't really know why they're there - I thought you used / exclusively

-- SvenDowideit - 18 May 2011

We do try... haven't got any robust method of properly normalising everything :/

-- PaulHarvey - 18 May 2011

I've made some changes to the js in getRef() that might help - will need to wait for testing onsite tho :/

-- SvenDowideit - 19 May 2011

Okay, I've got two crashes now. Sadly the second was a segfault kinda crash, but I didn't have -vvv on frown, sad smile

Here's the first, from Foswiki's debug.log:
| 2011-05-19T15:17:12Z debug | Indexing attachment Mangroves.SemanticLinksReport.igp_325_500_220_DirectedGraphPlugin_1.png |
| 2011-05-19T15:30:53Z debug | SEARCH crash: Files.PublicCPBR: SEARCH{"parent.name='PublicCPBR' "...
Caused the following error:
query error: JS_NewObject failed for global
-------
 |

And the corresponding MongoDB bit:
Fri May 20 01:30:51 [conn418] query: foswiki__Files.current{ _history: { $exists: 0 }, _web: "Files", _topic: "WebHome" }
Fri May 20 01:30:51 [conn418] query foswiki__Files.current ntoreturn:1 reslen:10630 nreturned:1 0ms
Fri May 20 01:30:51 [conn418] query: foswiki__Files.$cmd{ count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "PublicCPBR" } }
Fri May 20 01:30:51 [conn418] run command foswiki__Files.$cmd { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "PublicCPBR" } }
Fri May 20 01:30:51 [conn418] JS Error: out of memory
Fri May 20 01:30:51 [conn418] Assertion: 10433:js init failed
0x55ece9 0x61cbb1 0x610cc3 0x606977 0x605579 0x5e346b 0x73d993 0x7902f6 0x795a63 0x79951b 0x79bc54 0x79c7f7 0x643578 0x7f169d 0x7db210 0x7dc741 0x642c25 0x6477c6 0x752225 0x757938 
Fri May 20 01:30:52 [conn415] query: admin.$cmd{ listDatabases: 1 }
Fri May 20 01:30:52 [conn415] run command admin.$cmd { listDatabases: 1 }
Fri May 20 01:30:52 [conn415] command: { listDatabases: 1 }
Fri May 20 01:30:52 [conn415] checking size file /var/lib/mongodb/seta/foswiki__Marine__SeaSlugs/foswiki__Marine__SeaSlugs.ns
query admin.$cmd ntoreturn:1 command: { listDatabases: 1 } reslen:7949 5ms
Fri May 20 01:30:52 [conn415] query: foswiki__System.current{ _history: { $exists: 0 }, _web: "System", _topic: "DefaultPreferences" }
Fri May 20 01:30:52 [conn415]   running multiple plans
Fri May 20 01:30:52 [conn415] query foswiki__System.current ntoreturn:1 reslen:56856 nreturned:1 0ms
Fri May 20 01:30:52 [conn415] query: admin.$cmd{ listDatabases: 1 }
Fri May 20 01:30:52 [conn415] run command admin.$cmd { listDatabases: 1 }
Fri May 20 01:30:52 [conn415] command: { listDatabases: 1 }
Fri May 20 01:30:52 [conn415] checking size file /var/lib/mongodb/seta/foswiki__Marine__SeaSlugs/foswiki__Marine__SeaSlugs.ns
query admin.$cmd ntoreturn:1 command: { listDatabases: 1 } reslen:7949 5ms
Fri May 20 01:30:52 [conn415] query: foswiki__System.current{ _history: { $exists: 0 }, _web: "System", _topic: "CompareRevisionsAddonPlugin" }
Fri May 20 01:30:52 [conn415]   running multiple plans
Fri May 20 01:30:52 [conn415] query foswiki__System.current ntoreturn:1 reslen:36 nreturned:0 0ms
Fri May 20 01:30:52 [conn415] query: admin.$cmd{ listDatabases: 1 }
Fri May 20 01:30:52 [conn415] run command admin.$cmd { listDatabases: 1 }
Fri May 20 01:30:52 [conn415] command: { listDatabases: 1 }
 /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
 /usr/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x231) [0x61cbb1]
 /usr/bin/mongod(_ZN5mongo8SMEngine11createScopeEv+0x23) [0x610cc3]
 /usr/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x17) [0x606977]
 /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x259) [0x605579]
 /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e346b]
 /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
 /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x5b) [0x79951b]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
 /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
 /usr/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3b8) [0x643578]
 /usr/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xad) [0x7f169d]
 /usr/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x9a0) [0x7db210]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dc741]
 /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x642c25]
 /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x2e86) [0x6477c6]
 /usr/bin/mongod() [0x752225]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
Fri May 20 01:30:52 [conn415] checking size file /var/lib/mongodb/seta/foswiki__Marine__SeaSlugs/foswiki__Marine__SeaSlugs.ns
query admin.$cmd ntoreturn:1 command: { listDatabases: 1 } reslen:7949 6ms
Fri May 20 01:30:52 [conn415] query: foswiki__System.current{ _history: { $exists: 0 }, _web: "System", _topic: "ImagePlugin" }
Fri May 20 01:30:52 [conn415]   running multiple plans
Fri May 20 01:30:52 [conn415] query foswiki__System.current ntoreturn:1 reslen:36 nreturned:0 0ms

-- PaulHarvey - 20 May 2011

The second crash's segfault.

So the timeline was (Foswiki 40s faster than mongod):
  • Foswiki, 02:17:19 - Mangroves.SemanticLinksReport cron job run is finishing up and saving topics
  • mongod First problem in mongo log, is a huge flood of:
    Fri May 20 12:17:03 [conn48] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: foswiki__TaxonProfile__Relationships.current
    repeated ~100 times in that single second
  • mongod Again, but this time ending in a JS_NewObject failed - the first time for this mongod instance:
    Fri May 20 12:25:08 [conn52] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: foswiki__TaxonProfile__Relationships.current
     Fri May 20 12:25:08 [conn53] update foswiki__Mangroves.current  query: { address: "Mangroves.SemanticLinksReportList@324" } 124ms
     Fri May 20 12:25:08 [conn53] end connection 150.229.2.48:43264
     Fri May 20 12:25:09 [conn52] query foswiki__TaxonProfile__Relationships.current scanAndOrder  reslen:36 nscanned:2384 { orderby: { FIELD.Relationship.value: 1 }, query: { _history: { $exists: 0 }, $where: " ( (1) )  &&  ( ( ( ( ( foswiki_getField(this, 'FORM.name') == 'TaxonP..." } }  nreturned:0 1905ms
     Fri May 20 12:25:55 [initandlisten] connection accepted from 150.229.2.48:36732 #59
     Fri May 20 12:26:04 [conn59] query foswiki__Marine__Sponges__Specimens.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "G324727" } } reslen:64 173ms
     Fri May 20 12:26:37 [conn58] query foswiki__C4C.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 255ms
     Fri May 20 12:26:37 [conn58] query foswiki__HubRIS__HubRISInternal.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 281ms
     Fri May 20 12:26:38 [conn58] query foswiki__Marine__Fish.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 187ms
     Fri May 20 12:26:39 [conn58] query foswiki__Sandbox.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 263ms
     Fri May 20 12:26:39 [conn58] query foswiki__TemporaryAddressTestsTestWeb__SubWeb__SubSubWeb.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 308ms
     Fri May 20 12:26:40 [conn58] query foswiki__Weeds.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name:  "PERM_ID_HUMAN" } } } } reslen:64 261ms
     Fri May 20 12:26:40 [conn58] JS Error: out of memory
     Fri May 20 12:26:41 [conn58] JS Error: out of memory
     Fri May 20 12:26:41 [conn58] Assertion: 10432:JS_NewObject failed for global
     0x55ece9 0x61cbc7 0x610cc3 0x606977 0x605579 0x5e346b 0x73d993 0x7902f6 0x795a63 0x799715 0x79bc54 0x79c7f7 0x643578 0x7f169d 0x7db210 0x7dc741 0x642c25 0x6477c6 0x752225 0x757938 
     /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
     /usr/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x247) [0x61cbc7]
     /usr/bin/mongod(_ZN5mongo8SMEngine11createScopeEv+0x23) [0x610cc3]
     /usr/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x17) [0x606977]
     /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x259) [0x605579]
     /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e346b]
     /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x799715]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
     /usr/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3b8) [0x643578]
     /usr/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xad) [0x7f169d]
     /usr/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x9a0) [0x7db210]
     /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dc741]
     /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x642c25]
     /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x2e86) [0x6477c6]
     /usr/bin/mongod() [0x752225]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
     Fri May 20 12:26:41 [conn58] Count with ns: foswiki__Weeds__WeedsInternal.current and query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value: "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } failed with exception: exception: 10432 JS_NewObject failed for global
     Fri May 20 12:26:41 [conn58] query foswiki__Weeds__WeedsInternal.$cmd ntoreturn:1 command: { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", PREFERENCE.__RAW_ARRAY: { $elemMatch: { value:  "TaxonProfile_efinitions_SeaSlugs_Species_ViewTemplate", name: "PERM_ID_HUMAN" } } } } reslen:64 868ms
     Fri May 20 12:26:41 [conn58] JS Error: out of memory
     Fri May 20 12:26:41 [conn58] Assertion: 10432:JS_NewObject failed for global
     0x55ece9 0x61cbc7 0x610cc3 0x606977 0x605579 0x5e346b 0x73d993 0x7902f6 0x795a63 0x79951b 0x79bc54 0x79c7f7 0x64543b 0x752225 0x757938 0x8a3b3e 0x8b6a40 0x7f93925b99ca 0x7f9391b6870d 
     /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
     /usr/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x247) [0x61cbc7]
     /usr/bin/mongod(_ZN5mongo8SMEngine11createScopeEv+0x23) [0x610cc3]
     /usr/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x17) [0x606977]
     /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x259) [0x605579]
     /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e346b]
     /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x5b) [0x79951b]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
     /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xafb) [0x64543b]
     /usr/bin/mongod() [0x752225]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
     /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
     /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
     /lib/libpthread.so.0(+0x69ca) [0x7f93925b99ca]
     /lib/libc.so.6(clone+0x6d) [0x7f9391b6870d]
     Fri May 20 12:26:41 [conn58] JS Error: out of memory
     Fri May 20 12:26:41 [conn58] Assertion: 10432:JS_NewObject failed for global
    
  • Foswiki First problem in Foswiki debug log:
    | 2011-05-20T02:27:21Z debug | SEARCH crash: TaxonProfile/Builder.TermsWithoutMappings: SEARCH{"form.name~'*TermForm' AND TermBasis~'*PlumeMoths:Specimens*'"...
     Caused the following error:
     query error: JS_NewObject failed: toJSObject2
     -------
  • mongod The sequence of JS Error: out of memory keeps repeating with backtraces on all, until:
    Fri May 20 12:29:56 [conn58] JS Error: out of memory
     Fri May 20 12:29:56 [conn58] Assertion: 13072:JS_NewObject failed: toJSObject2
     0x55ece9 0x61d2aa 0x61d4ca 0x5ec7dc 0x73dd3c 0x65b06e 0x78d476 0x795f91 0x799715 0x79bc54 0x79c7f7 0x64543b 0x752225 0x757938 0x8a3b3e 0x8b6a40 0x7f93925b99ca 0x7f9391b6870d 
     /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
     /usr/bin/mongod(_ZN5mongo9Convertor10toJSObjectEPKNS_7BSONObjEb+0x34a) [0x61d2aa]
     /usr/bin/mongod(_ZN5mongo7SMScope7setThisEPKNS_7BSONObjE+0x6a) [0x61d4ca]
     /usr/bin/mongod(_ZN5mongo7Matcher7matchesERKNS_7BSONObjEPNS_12MatchDetailsE+0xeec) [0x5ec7dc]
     /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcher7matchesERKNS_7BSONObjERKNS_7DiskLocEPNS_12MatchDetailsEb+0xec) [0x73dd3c]
     /usr/bin/mongod(_ZN5mongo11UserQueryOp4nextEv+0x2ae) [0x65b06e]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE+0x56) [0x78d476]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x861) [0x795f91]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x799715]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
     /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xafb) [0x64543b]
     /usr/bin/mongod() [0x752225]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x757938]
     /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
     /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
     /lib/libpthread.so.0(+0x69ca) [0x7f93925b99ca]
     /lib/libc.so.6(clone+0x6d) [0x7f9391b6870d]
     Fri May 20 12:29:56 [conn58] assertion 13072 JS_NewObject failed: toJSObject2 ns:foswiki__Mangroves.current query:{ orderby: { _topic: 1 }, query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "MangroveAndSaltmarshSpecies" } }
     Fri May 20 12:29:56 [conn58] query foswiki__Mangroves.current exception  781ms
     Fri May 20 12:30:59 [conn48] JS Error: out of memory
     Fri May 20 12:30:59 [conn48] compile failed for: function(dateString) {
        if (dateString == undefined) {
            return dateString;
        }
    
        var parseTime = function(date, defaultLocal) {
     ...
        return parsedDate.getTime(); //return seconds since 1970
     }
     Fri May 20 12:30:59 Invalid access at address: 0
    
     Fri May 20 12:30:59 Got signal: 11 (Segmentation fault).
    
     Fri May 20 12:31:00 Backtrace:
     0x8a2e59 0x8a3430 0x7f93925c28f0 0x90d770 0x62178d 0x622d96 0x604724 0x60540e 0x5e346b 0x73d993 0x7902f6 0x795a63 0x799715 0x79bc54 0x79c7f7 0x643578 0x7f169d 0x7db210 0x7dc741 0x642c25 
     /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x8a2e59]
     /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x220) [0x8a3430]
     /lib/libpthread.so.0(+0xf8f0) [0x7f93925c28f0]
     /usr/bin/mongod(JS_GetFunctionObject+0xc) [0x90d770]
     /usr/bin/mongod(_ZN5mongo9Convertor5tovalERKNS_11BSONElementE+0xb2d) [0x62178d]
     /usr/bin/mongod(_ZN5mongo7SMScope10setElementEPKcRKNS_11BSONElementE+0x46) [0x622d96]
     /usr/bin/mongod(_ZN5mongo5Scope10loadStoredEb+0x3a4) [0x604724]
     /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0xee) [0x60540e]
     /usr/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e346b]
     /usr/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73d993]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x7902f6]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x333) [0x795a63]
     /usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x799715]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79bc54]
     /usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79c7f7]
     /usr/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3b8) [0x643578]
     /usr/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xad) [0x7f169d]
     /usr/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x9a0) [0x7db210]
     /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dc741]
     /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x642c25]
    
     Fri May 20 12:31:00 dbexit: 
     Fri May 20 12:31:00 [conn48] shutdown: going to close listening sockets...
     Fri May 20 12:31:00 [conn48] closing listening socket: 5
     Fri May 20 12:31:00 [conn48] closing listening socket: 6
     Fri May 20 12:31:00 [conn48] closing listening socket: 7
     Fri May 20 12:31:00 [conn48] closing listening socket: 8
     Fri May 20 12:31:00 [conn48] removing socket file: /tmp/mongodb-27017.sock
     Fri May 20 12:31:00 [conn14] Assertion: 13073:shutting down
     0x55ece9 0x751518 0x7587bb 0x8a3b3e 0x8b6a40 0x7f93925b99ca 0x7f9391b6870d 
     /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
     /usr/bin/mongod(_ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE+0x7f8) [0x751518]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x143b) [0x7587bb]
     /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
     /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
     /lib/libpthread.so.0(+0x69ca) [0x7f93925b99ca]
     /lib/libc.so.6(clone+0x6d) [0x7f9391b6870d]
     Fri May 20 12:31:00 [conn14] AssertionException in connThread, closing client connection
     Fri May 20 12:31:00 [conn14]  shutting down
     Fri May 20 12:31:00 [conn13] Assertion: 13073:shutting down
     0x55ece9 0x751518 0x7587bb 0x8a3b3e 0x8b6a40 0x7f93925b99ca 0x7f9391b6870d 
     /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55ece9]
     /usr/bin/mongod(_ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE+0x7f8) [0x751518]
     /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x143b) [0x7587bb]
     /usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x8a3b3e]
     /usr/bin/mongod(thread_proxy+0x80) [0x8b6a40]
     /lib/libpthread.so.0(+0x69ca) [0x7f93925b99ca]
     /lib/libc.so.6(clone+0x6d) [0x7f9391b6870d]
     Fri May 20 12:31:00 [conn13] AssertionException in connThread, closing client connection
     Fri May 20 12:31:00 [conn13]  shutting down
     Fri May 20 12:31:00 [conn48] removing socket file: /tmp/mongodb-28017.sock
     Fri May 20 12:31:00 [conn48] shutdown: going to flush diaglog...
     Fri May 20 12:31:00 [conn48] shutdown: going to close sockets...
     Fri May 20 12:31:00 [conn48] shutdown: waiting for fs preallocator...
     Fri May 20 12:31:00 [conn48] shutdown: closing all files...
     Fri May 20 12:31:00 closeAllFiles() finished
     Fri May 20 12:31:00 [conn48] shutdown: removing fs lock...
     Fri May 20 12:31:00 dbexit: really exiting now
     Fri May 20 12:31:00 ERROR: Client::~Client _context should be null but is not; client:conn
     
  • Foswiki Final error before the segfault (many other queries, including ref queries failing with JS_NewObject failed up to now):
    | 2011-05-20T02:29:56Z debug | SEARCH crash: System.WidgetsTrinSkin: SEARCH{"parent.name='MangroveAndSaltmarshSpecies'"...
     Caused the following error:
     query error: JS_NewObject failed: toJSObject2
     -------
     |

I'm now logging with -vvvvv as originally requested by Scott (hope my disks don't fill up this weekend...)

-- PaulHarvey - 20 May 2011

Dropped everything, trying mongod 1.8.2-rc1, re-imported.

-- PaulHarvey - 21 May 2011

Looking less crashy so far... we'll see after a few days. I am seeing a query that doesn't seem able to be hoisted, moved to Item10670.

-- PaulHarvey - 23 May 2011

Crash 5

Today, there were no errors anywhere, until 12:25:42 in the mongod PRIMARY:
Mon May 23 12:25:42 [conn12] query: admin.$cmd{ replSetHeartbeat: "seta", v: 3, pv: 1, checkEmpty: false, from: "mongo3.example.org:27017" }
Mon May 23 12:25:42 [conn12] run command admin.$cmd { replSetHeartbeat: "seta", v: 3, pv: 1, checkEmpty: false, from: "mongo3.example.org:27017" }
Mon May 23 12:25:42 [conn12] query admin.$cmd ntoreturn:1 command: { replSetHeartbeat: "seta", v: 3, pv: 1, checkEmpty: false, from: "mongo3.example.org:27017" } reslen:132 0ms
Mon May 23 12:25:42 [conn397] JS Error: out of memory
Mon May 23 12:25:42 [conn397] Assertion: 10433:js init failed
Backtrace:
0x55f299 0x61de41 0x611f53 0x607a07 0x606609 0x5e459b 0x73f0d3 0x791956 0x797213 0x79adc8 0x79e7d4 0x79f3d7 0x645231 0x7e00c8 0x7dc2d4 0x7dd381 0x647015 0x64a5ae 0x753965 0x759078 
 /usr/local/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55f299]
 /usr/local/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x231) [0x61de41]
 /usr/local/bin/mongod(_ZN5mongo8SMEngine11createScopeEv+0x23) [0x611f53]
 /usr/local/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x17) [0x607a07]
 /usr/local/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x259) [0x606609]
 /usr/local/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e459b]
 /usr/local/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73f0d3]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x791956]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x363) [0x797213]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x58) [0x79adc8]
 /usr/local/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79e7d4]
 /usr/local/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79f3d7]
 /usr/local/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3c1) [0x645231]
 /usr/local/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xa8) [0x7e00c8]
 /usr/local/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x724) [0x7dc2d4]
 /usr/local/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dd381]
 /usr/local/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x647015]
 /usr/local/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x324e) [0x64a5ae]
 /usr/local/bin/mongod() [0x753965]
 /usr/local/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x759078]

But this didn't seem to trigger any error log in our application, and at that time the app was serving a fairly boring view request. The JS Error: out of memory continue sporadically for these heartbeat and getmore local.oplog requests (see log).

A few seconds later, we see our application try to serve its next request, and fails:

Time Log Message
2011-05-23T02:25:56Z apache access "GET /TaxonProfile/Definitions/FoA_Title HTTP/1.1" 200 10112
2011-05-23T02:25:56Z Foswiki events.log info, view, FoA_Title
2011-05-23T02:25:58Z Foswiki debug.log SEARCH crash: TaxonProfile/Definitions.FoA_Title: SEARCH{"parent.name='FoA' "... Caused the following error: query error: JS_NewObject failed for global
2011-05-23T12:25:56+10 mongod PRIMARY -vvvvv JS Error: out of memory ... Assertion: 10432:JS_NewObject failed for global

The full mongod error snippet is here:
Mon May 23 12:25:56 [conn407] query: foswiki__TaxonProfile__Definitions.current{ _history: { $exists: 0 }, _web: "TaxonProfile/Definitions", _topic: "FoA" }
Mon May 23 12:25:56 [conn407]   running multiple plans
Mon May 23 12:25:56 [conn407]    used cursor: 0x7fb65c852740
Mon May 23 12:25:56 [conn407] query foswiki__TaxonProfile__Definitions.current ntoreturn:1 reslen:6287 nreturned:1 0ms
Mon May 23 12:25:56 [conn407] query: foswiki__TaxonProfile__Definitions.$cmd{ count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "FoA" } }
Mon May 23 12:25:56 [conn407] run command foswiki__TaxonProfile__Definitions.$cmd { count: "current", query: { _history: { $exists: 0 }, $where: " (1) ", TOPICPARENT.name: "FoA" } }
Mon May 23 12:25:56 [conn397] JS Error: out of memory
Mon May 23 12:25:56 [conn397] Assertion: 10432:JS_NewObject failed for global
Stacktrace:
0x55f299 0x61de57 0x611f53 0x607a07 0x606609 0x5e459b 0x73f0d3 0x791956 0x797213 0x79adc8 0x79e7d4 0x79f3d7 0x645231 0x7e00c8 0x7dc2d4 0x7dd381 0x647015 0x64a5ae 0x753965 0x759078 
 /usr/local/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x55f299]
 /usr/local/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x247) [0x61de57]
 /usr/local/bin/mongod(_ZN5mongo8SMEngine11createScopeEv+0x23) [0x611f53]
 /usr/local/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x17) [0x607a07]
 /usr/local/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x259) [0x606609]
 /usr/local/bin/mongod(_ZN5mongo7MatcherC1ERKNS_7BSONObjEb+0x4fb) [0x5e459b]
 /usr/local/bin/mongod(_ZN5mongo19CoveredIndexMatcherC1ERKNS_7BSONObjES3_b+0x43) [0x73f0d3]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6initOpERNS_7QueryOpE+0x1c6) [0x791956]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x363) [0x797213]
 /usr/local/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x58) [0x79adc8]
 /usr/local/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x79e7d4]
 /usr/local/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x79f3d7]
 /usr/local/bin/mongod(_ZN5mongo8runCountEPKcRKNS_7BSONObjERSs+0x3c1) [0x645231]
 /usr/local/bin/mongod(_ZN5mongo8CmdCount3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xa8) [0x7e00c8]
 /usr/local/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x724) [0x7dc2d4]
 /usr/local/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x831) [0x7dd381]
 /usr/local/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x647015]
 /usr/local/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x324e) [0x64a5ae]
 /usr/local/bin/mongod() [0x753965]
 /usr/local/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5b8) [0x759078]

And then a bit over an hour later, by 2011-05-23T13:43:23+10, mongod has died, segfault, failing to compile some JS (actual JS it fails to compile is visible in the seta.log.022c - this code normally does compile & work).

The logs:
  • seta.log.022b: first sign of trouble... filtered with grep -v ^checking seta.log.022 > seta.log.022c
  • seta.log.022c: segfault & shutdown... filtered with grep -v ^checking seta.log.022 > seta.log.022c

-- PaulHarvey - 23 May 2011

So, I created https://jira.mongodb.org/browse/SERVER-3131

Sven, could you please review the following MongoDB jira bugs and see if they are relevant to us:

-- PaulHarvey - 23 May 2011

mongod --version
db version v1.9.1-pre-, pdfile version 4.5
Tue May 24 08:11:51 git version: ff53662a8f44862ca9430a7a6e7e83904e9d7b1d

-- PaulHarvey - 23 May 2011

This is now waiting for me, to reproduce on our production site (the only place we can trigger the problem!) which involves ceasing daily mongod restarts and then if that doesn't trigger the fault which has an MTBF ~2 business days, shrinking the JS env down from 128M until the problem is back. In fact maybe I will just revert to the latest official mongod. At which point I still don't know what to do: 10gen says they have a memory leak, or that we have a memory leak in our JS code. Either way we need a simple reproducible test case which makes monogdb die.

-- PaulHarvey - 01 Jun 2011

This is less urgent now, more than a week between restarts if you use mongodb 1.9

-- PaulHarvey - 07 Jul 2011

We've not been restarting mongod daily for quite a few weeks now, and we haven't seen this issue again with the MongoDB 1.9.x or 2.0-rcX series. I think this task has run its course. Closing.

-- PaulHarvey - 05 Sep 2011
 
I Attachment Action Size Date Who Comment
seta.loglog seta.log manage 1 MB 23 Apr 2011 - 00:10 PaulHarvey  
seta.log.022b022b seta.log.022b manage 268 K 23 May 2011 - 06:14 PaulHarvey Filtered with =grep -v ^checking seta.log.022 > seta.log.022b=
seta.log.022c022c seta.log.022c manage 15 K 23 May 2011 - 06:39 PaulHarvey segfault bit... filtered with grep -v ^checking seta.log.022 > seta.log.022c
Topic revision: r21 - 05 Sep 2011, PaulHarvey
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy