The random rantings of a concerned programmer.

Archive for December, 2008

see if you can spot the bug

December 30th, 2008 | Category: Random
post_ids <- case r of 
  Nothing -> (do
    -- actually have to run the query here
    let q = "SELECT post_id FROM s_posts p WHERE " ++ q_where ++ " ORDER BY p.post_date DESC"
      where q_where = intercalate " AND " post_clauses

    stm <- prepare db q
    execute stm post_params
    rs <- fetchAllRows' stm

    let post_ids = map (\x -> fromSql (x !! 0) :: Int) rs

    -- cache the results
    let q = "INSERT INTO " ++ cache_table ++ "( query_string, query_result ) VALUES( ?, ? )"
    stm <- prepare db q
    execute stm [toSql q_serialized, toSql (encode post_ids)]

    return post_ids)

This block of code runs right after checking to see if the fulltext search results are cached. Specifically, this bit gets run if the search isn't currently cached -- it performs the fulltext search and throws the results back into the database as a JSON-encoded list of matching post IDs.

The kicker here is the fetchAllRows'. This function differs from fetchAllRows in that it is strictly evaluated -- which means if you've got a lot of search results, it'll pull them in all at once and leave you sitting with a massive list in memory. Instead, we'd like to collect the results and throw them to the DB as we're fetching them. The solution is to switch to the lazy fetchAllRows, making sure to use a separate stm (because a subsequent execute on the same statement, before the lazy list has been fully read, will dick stuff up).

I won't bother posting the fixed code; it's a fairly trivial change.
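For anyone who does want to see the shape of it, here's a rough sketch. It is not the code that's actually running: searchAndCache, its arguments and the serialize parameter (standing in for the JSON encode call) are made-up names, and it assumes the driver is fine with a second statement being executed while the first one still has unread rows.

```haskell
import Database.HDBC
  ( IConnection, SqlValue, prepare, execute, fetchAllRows, fromSql, toSql )

-- Hypothetical helper, not the real 4scrape code: run the search query,
-- then cache the serialized result list under the given key.
searchAndCache
  :: IConnection conn
  => conn
  -> String             -- SELECT query text
  -> [SqlValue]         -- parameters for the SELECT
  -> String             -- name of the cache table
  -> String             -- serialized query, used as the cache key
  -> ([Int] -> String)  -- serializer, e.g. a JSON encoder (assumption)
  -> IO [Int]
searchAndCache db selectQ selectParams cacheTable key serialize = do
  selectStm <- prepare db selectQ
  _ <- execute selectStm selectParams
  -- Lazy variant: rows are only pulled in as the list is consumed.
  rows <- fetchAllRows selectStm
  let postIds = map (\row -> fromSql (head row) :: Int) rows

  -- A separately named statement, so selectStm is never re-executed
  -- while its lazy row list is still being read.
  insertStm <- prepare db
    ("INSERT INTO " ++ cacheTable
      ++ " ( query_string, query_result ) VALUES ( ?, ? )")
  _ <- execute insertStm [toSql key, toSql (serialize postIds)]

  return postIds
```

The list still gets walked in full when it's serialized for the INSERT, so this only changes when the rows are pulled in, not how many of them there are.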

This also explains why I was never able to reproduce the problem by re-executing requests from the access logs -- this code only gets triggered the first time the search hits the database. Once it was run, the results got cached and this bit of code wouldn't need to execute again.

Sneaky motherfuckers!



December 30th, 2008 | Category: Random

This shit sucks. I finally got around to migrating to the new version of 4scrape, the one in Haskell that uses Lighttpd and PostgreSQL instead of Apache and MySQL. Spent all afternoon migrating the scraper and the database itself and got it all “working” this evening.

And then I notice that, very rarely, an FCGI process will just start spinning in an infinite loop, consuming 100% CPU. I try fucking everything I can to reproduce it, but I can’t. So then I take the Lighttpd access log from the entire day and write a short Python script to open every one of those fucking URLs until something breaks. And the fucking script made it to the end of the list without breaking anything.

As matt pointed out, it’s a heisenbug. What’s more, it’s a fucking showstopper. Sure, I can wrap the code with a nice STM timeout so it kills itself when it takes too long, but that’s just nasty. And there are other outstanding issues and things to implement, and I’m fucking frustrated enough with it right now.
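For reference, the timeout band-aid could look something like the sketch below. It uses System.Timeout.timeout from base rather than anything STM-specific, and withDeadline and demo are made-up names, with demo standing in for a request handler that returns and one that hangs.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- Hypothetical wrapper (not from the 4scrape source): run an action, but
-- give up after the given number of seconds. Nothing means it timed out.
withDeadline :: Int -> IO a -> IO (Maybe a)
withDeadline seconds = timeout (seconds * 1000000)  -- timeout takes microseconds

-- One action that finishes immediately and one that sleeps past the deadline.
demo :: IO (Maybe String, Maybe String)
demo = do
  fast <- withDeadline 1 (return "done")
  slow <- withDeadline 1 (threadDelay 5000000 >> return "never")
  return (fast, slow)
```

Running demo should give (Just "done", Nothing). One caveat: timeout works by throwing an asynchronous exception at the action, so a tight loop that never allocates, or a blocking foreign call, may not be interruptible this way.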

I’m just going to say fuck it and rewrite the mod_python wrapper for the old frontend to use FCGI, start MySQL back up and throw it in with Lighttpd. Because this shit is just bananas — I’m not going to try to debug a problem that I can’t even fucking reproduce. Especially when the language is Haskell, where everything is deterministic. I’ve had enough anux haxing for one day.


Snowcrash is Awesome

December 27th, 2008 | Category: Random

also so is darcs.

fuck you Xarn!

new christmas mittens are warm.



December 26th, 2008 | Category: Random

Bleh, I was dicking around today with the Haskell NCurses bindings, HSCurses, and realized it’s missing some pretty core functionality. For example, I want to split the screen into two separate windows for a chat client — a main screen to display the text, and a smaller one to buffer typed content. With NCurses, this means you’d create two separate windows, and that’s fine and all.

The problem is that to actually get the changes made to those windows to display on the screen, you have to call either wrefresh or wnoutrefresh (which effectively copy the data in the window’s buffer to the main screen buffer). Unfortunately, HSCurses provides no binding for either of these two functions. It only provides refresh (which effectively calls wrefresh on the main window) and doupdate (which you’d use with wnoutrefresh to update the screen). So I needed to add some extra bindings.

Surprisingly, doing so wasn’t that difficult. I grabbed the source with a quick cabal fetch HSCurses, unpacked the archive and made changes to the module. FFI is fairly straightforward in Haskell (I think the FFI semantics changed in GHC 6.10.1, but I don’t have it installed to test it):

-- | wrefresh refreshes the specified window, copying the data
-- from the virtual screen to the physical screen.
wrefresh :: Window -> IO ()
wrefresh w = throwIfErr_ "wrefresh" $ wrefresh_c w

foreign import ccall unsafe "HSCurses.h wrefresh"
        wrefresh_c :: Window -> IO CInt

Anyway, a quick rebuild of the library and the shit now gets drawn to the screen properly. Hooray.
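The same trick should work for the wnoutrefresh route. Below is a standalone sketch that binds ncurses directly instead of patching HSCurses, so everything except the two C functions is an assumption: WindowTag, checkErr and refreshBoth are made-up names, and the Window type only mimics a pointer to the C WINDOW struct. It needs to be linked against ncurses.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module NoutRefresh (Window, wnoutrefresh, doupdate, refreshBoth) where

import Control.Monad (when)
import Foreign.C.Types (CInt(..))
import Foreign.Ptr (Ptr)

-- Opaque stand-in for the C WINDOW struct (hypothetical, for this sketch).
data WindowTag
type Window = Ptr WindowTag

foreign import ccall unsafe "ncurses.h wnoutrefresh"
        wnoutrefresh_c :: Window -> IO CInt

foreign import ccall unsafe "ncurses.h doupdate"
        doupdate_c :: IO CInt

-- ncurses signals failure by returning ERR, which is defined as -1.
checkErr :: String -> IO CInt -> IO ()
checkErr name act = do
  rc <- act
  when (rc == (-1)) $ ioError (userError (name ++ " failed"))

-- | Copy a window's contents to the virtual screen without drawing yet.
wnoutrefresh :: Window -> IO ()
wnoutrefresh w = checkErr "wnoutrefresh" (wnoutrefresh_c w)

-- | Push the virtual screen out to the physical screen.
doupdate :: IO ()
doupdate = checkErr "doupdate" doupdate_c

-- | Stage both windows, then draw once instead of once per window.
refreshBoth :: Window -> Window -> IO ()
refreshBoth mainWin inputWin = do
  wnoutrefresh mainWin
  wnoutrefresh inputWin
  doupdate
```

For the two-window chat layout this means staging the text window and the input window, then hitting the terminal a single time.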

I really should go through HSCurses, add in some functionality it’s missing and add in some actual documentation then submit a big fucking patch to the maintainer. Hur hur. *grabs dick*



December 23rd, 2008 | Category: Random

fuck it I’m bored with this RDF wiki shit and don’t feel like doing any more work on it. if anyone wants to see my terribly unreadable haskell source + javascript + database dump (minus ip addresses) and shit let me know and I’ll upload it somewhere (otherwise cbf’d).

also I accidentally tore a hole in my favorite pair of mittens :(

