Archive for the ‘From The Lines’ Category

Bad Data Design

Code Follies, From The Lines | Posted by attriel October 17th, 2008

Today I was working on some changes that had been requested to a system we manage.  Technically, they’d been requested through another group and passed to a third group, who talked to me about it a month or so ago.  I explained how the change needed to be processed into our system so that the person asking could actually see the result.  But we decided that, because it’s a legacy process that translates into our database, they’d get the system figured out and then we’d figure out how to pass the updates piecemeal, since until the flags are set for each record, they can update the source all they want and nothing happens.

Today I got a forward of a forward of the tail end of a conversation between two people that had been passed to a third.  Yeah, does that sound confusing?  Me too!  End result being that the guy who requested the original change was wondering why he’d been told it was done but it wasn’t showing up.

Somewhere along the way, the flags were never updated.  Oops.  Turns out, upon further inspection, that they updated table A, while we process from view B, which gets its data from C.  C has no relation to A.  So even if they’d updated the flags, we never would have gotten anything :o  So after some discussion I processed their updates directly into our dev environment so we can make sure that they do what everyone wants them to do.  I don’t actually know if the source data will ever get updated :o
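To make the disconnect concrete, here is a minimal sketch in Python using an in-memory SQLite database.  The table and column names (`a`, `b`, `c`, `flag`) are invented for illustration; they are assumptions, not the real schema, and the real system is not necessarily SQLite.  All it shows is that setting flags in A does nothing for anyone reading from view B when B is built on C.

```python
import sqlite3

# Hypothetical stand-ins: table a is what got updated, view b is what the
# downstream process reads, and table c is what b is actually built on.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, flag INTEGER DEFAULT 0);
    CREATE TABLE c (id INTEGER PRIMARY KEY, flag INTEGER DEFAULT 0);
    CREATE VIEW b AS SELECT id, flag FROM c;
    INSERT INTO a (id) VALUES (1), (2);
    INSERT INTO c (id) VALUES (1), (2);
""")

# Set the flags in a, the table the other group was working in.
conn.execute("UPDATE a SET flag = 1")

# Count flagged rows in the table that was updated and in the view we read.
flagged_in_a = conn.execute("SELECT COUNT(*) FROM a WHERE flag = 1").fetchone()[0]
flagged_in_b = conn.execute("SELECT COUNT(*) FROM b WHERE flag = 1").fetchone()[0]

print("flagged in a:", flagged_in_a)
print("flagged in b:", flagged_in_b)
```

Every row in `a` ends up flagged while the view stays at zero, because nothing ties `a` to `c`.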

On an entirely unrelated note …. What do you call a design that has two unique keys for every entry?

Now … what do you call it when it turns out one of them is not guaranteed to be unique?

OK … And what about when the other one is guaranteed to be unique EXCEPT WHEN IT ISN’T???

*sigh* Literally, I found a set of records where A and B shared key 1, then B and C shared key 2.  There is no way of uniquely identifying these without using (a) both keys, or (b) information that exists only inside the database.
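A rough sketch of how a collision like that could be spotted, in Python.  The records and field names (`name`, `key1`, `key2`) and the key values are made up to mirror the A/B/C situation above; they are assumptions, not the actual data or schema.

```python
from collections import defaultdict

# Hypothetical records: A and B share key 1, B and C share key 2.
records = [
    {"name": "A", "key1": 100, "key2": "x"},
    {"name": "B", "key1": 100, "key2": "y"},
    {"name": "C", "key1": 200, "key2": "y"},
]

def duplicates(rows, fields):
    """Map each value tuple shared by more than one row to the row names."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in fields)].append(row["name"])
    return {key: names for key, names in groups.items() if len(names) > 1}

key1_dupes = duplicates(records, ["key1"])          # collisions on key 1 alone
key2_dupes = duplicates(records, ["key2"])          # collisions on key 2 alone
pair_dupes = duplicates(records, ["key1", "key2"])  # collisions on the pair

print(key1_dupes, key2_dupes, pair_dupes)
```

With this toy data each key collides on its own, and only the (key 1, key 2) pair comes back clean, which is option (a) above.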

Of course, I will never get an explanation as to whose dumb idea this design was, because that person is probably one of the guys who’s gone by now.  Just like all the other idiotic design decisions that vex me (us) daily.

Password, Password, Who’s Got The Password?

From The Lines | Posted by attriel September 30th, 2008

So, the other day I was asked to run a security scan of an app as part of my supplemental duties as a non-attached member of a third party group.  Sure no problem.  Well, actually, I haven’t used the tool since I got the new system at the office, lemme make sure I can still access it.

Yep, there’s the prompt for a login.  <look up my password information> … wait, no?  Crap.  I thought that was what I changed the password to.  Oh, right, they changed the username to make it more obvious what the account was for … talk to the group lead, he doesn’t remember either.  He hasn’t been on in longer than me.

He takes me over to the guys who admin the server.  The guy we need to talk to is out, as is the second best bet.  So I ask person C, who I know is covering some of A’s projects while he’s out.  But they don’t know.  They suggest D.  We (three of us now, me, boss, and C) traipse over to D’s office.  Some discussion, C admits that E would be a better choice, but E is talking to their boss … So D brings out his security notebook, and starts paging through it.  “You’d be amazed at how much is in this book.  I’ve got every password on every system.”  “Except the one we want?”  “Well, you might be right.  I don’t see anything for that server.  Maybe see if E is done talking to Boss2?” … No.  But I talk to C for a bit on something else, Boss1 keeps talking to D on some subject, then is heading back to his office when Boss2 finishes, so I call Boss1 back to talk to E at least long enough to provide validation for my request for the password to an account whose name I don’t know …  Yeah, THAT’s not fishy at ALL.

Finally got the account name (it was pretty self-evident), and got the password reset since E didn’t have it noted anywhere either (they may not have it written down, since they can always just reset it).  FINALLY!

Elapsed time: 1hr.

Went back, started configuring the scan tool (Piece of …) (that took about 2 hours, but I pushed it off some to do some other tasks vaguely related to my normal job).  At some point in the afternoon, Boss3, who owns the app I’m scanning, sends me an email with “test account information”: the URL and the username I need to log in.  Great.  It’s a dummy test login for the QA server, so I don’t think anything of there not being a password in the email (it’s QA, so I wouldn’t have winced too badly at getting “Password: abcd1234” in the email with the username “Username: qatestacct1” …)

Wrong.  Needs a password.  Well hell, go over to Boss3’s office, but he’s left for the day.  Call Boss1, he says “oh yeah, I have the password” and gives it to me once I grab a pen.  Turns out my pen doesn’t work, but it scratches enough to let me read it 5 seconds later to type it in.  And it doesn’t work.  Well, he said it wasn’t case sensitive, but we’ll try it in the different cases.  He said “oh” and I took that as the letter, but maybe he meant the number?  Number and cases?  Hrrm.

Go find Boss4, who replaced Boss1 in his other duties when he moved to a new position, and is theoretically (I believe) above Boss3.  Maybe they’ll know who to ask, anyway.  They send me to G or H, and take me to G’s desk.  G doesn’t know, he takes me to H.  H isn’t there, but his stuff is, so he’s not gone yet, just out for a drink or maybe he’s looking for a password somewhere … Anyway, go back to my desk, try a few other things and some regular work, get up a few minutes later and go check and H is there, but he’s putting on his walking shoes (vs work shoes), so he’s obviously getting ready to leave.  Damn good timing on my part.  Some back and forth with me, G and H, H gets what I’m looking for and writes me down the password (case insensitive still).  Looks only vaguely like what Boss1 gave me earlier.  As in it shares some characters, but not most of them.  But it works!  I can now log in to the app, so I can now run the scan.

Elapsed time, Password 2: 45 minutes.

*sigh*

And the best part?  I’m fairly sure the scan is basically going to give me meaningless gibberish, because the “approved scanning tool” sucks.

Oh, and by the time I launched the scan?  Boss1 was gone.  It was his last day, he’s off to a new project at some other site.  Which means I’m the only person left on the team, and I’m kind of curious why he’s calling me for the results this week …

The Blame Game

From The Lines | Posted by attriel September 9th, 2008

Apparently people don’t like to take credit for their mistakes.  Gee, who knew?  And they like it even LESS when you point them out.  In front of their bosses.  Shocking.

Today was a wrapup meeting for a deployment that went horribly wrong in July.  Actually, it went badly, then wrong.  Then badly again in August before finally working.  Today’s meeting was to go over what all went wrong, why it went wrong, and how to avoid it in the future.  “Lessons Learned” kind of meeting.

Some of the issues were “unpredictable.”  Like the network switch ignoring the system.  Or the network configuration being wonky and requiring a new magic piece that didn’t need to be there before (and had no particularly good explanation for why it is suddenly needed now on multiple configurations).  And there was some confusion with the security group and whether the system was supposed to be checked or not.  That part sounded like a failure on the security group’s planning and distribution end.  To wit: They have a mailbox for these kinds of requests that no one in the server group knew about.  And even if they had, the policies are kind of vague.  Like “You ask for A, and we do it (when we can) and don’t tell you the results.”  That was actually the basic policy.  They’re looking into it and thinking about maybe, yaknow, telling you the results.  Instead of making you guess.

The code on the system, that was all configured and functional.  That had been tossed up on various occasions on an internal network for testing and qa.  But the networking had to be physically changed over to a whole new set of hardware to make it public rather than private.  The errors at that point were numerous.  And still not code related.

First error: The server group misconfigured a piece of hardware so it wasn’t coming up properly.  They actually took the bullet on that one, saying they made a typo.  Except that instead of 1234 it said abcd.  ABCD was the configuration from another system.  So, yes, technically she might have miskeyed it, but it’s more likely she started with that file and just didn’t fix that line.  Nitpick, since she took the bullet, but it’s still different.  And during the meeting she took the hit only after I mentioned that piece.  They kind of glossed over that error, blaming it all on other things.  Including error 2.

Second error: Some of the networking and scripting was messed up, didn’t work properly, and wasn’t allowing traffic correctly to the services.  Turns out that there were a bunch of lines on the old system that they hadn’t understood, so they had just copied some of them over wholesale without changing them to reflect the new system.  Others they declared to have no effect whatsoever and deleted.  Turns out that the prior admin had set them up for specific reasons, and even had a script to automatically generate those lines for any new server.  But failed to document why or how they worked.  So oops :o  The official explanation is “magic script” that fixed “undocumented problems.”  I’m still not sure if they documented the lines, their function, or the script.  I’m not convinced they know what any of them mean (I don’t, but that’s because I don’t know what they are; I’m still not convinced they weren’t a red herring tossed up to hide the first error, and after they got called on #1 they couldn’t retract #2).

Error the Third: Because of the way they have things set up (badly and undocumented), it turns out they needed to move a SECOND configuration to the new server.  Because it turns out that the services had two interfaces.  And both were actually necessary!  Go figure, who’d’ve thunk?  After the second “badly” they found this, more by accident than design.  Actually, they were wondering what A meant in the config, looked it up, and it came back with C.  Which didn’t match with the part where a different check (from a different server) came back with B.  When they should have matched.  When I asked about it, I was told offhand “Oh yeah, A has two answers, B and C, depending on where you ask from.”  Really?  Did we need to tell the new machine about C, then?  We told it about B, does it need C?  “Yeah.  Hey, do you think that could be the problem?  It’s supposed to be listening for B or C.”

And someone actually suggested that that COULDN’T be the problem.  Because it worked on the old machine.  DUH. … the old machine which knew both B and C?  That one?  That still knows C?  Yeah … Turns out that it WAS important.  But at today’s meeting?  They kept saying the only problem was a networking issue from outside their control that makes no sense.  Until I asked about the config B/C thing.  And then gave more details.  Then a few more.  And finally just told them exactly what had happened, how it had happened, and how it had gotten fixed.  “YOU forgot to move the config for C.”  The manager of the server group finally said that they vaguely recalled something from what I was saying, I MIGHT be right.  I think that was basically a signal to the other guy to drop it because I wasn’t letting them deny it.
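The B/C mixup is the kind of thing a dumb pre-move check could catch.  Here is a minimal Python sketch, assuming you can write down what A comes back as from each place you might ask and what each server is configured to listen for.  The vantage-point labels (`internal`, `external`) and the values are placeholders invented for illustration, since the real ones aren’t given here.

```python
# Hypothetical: what "A" comes back as, depending on where you ask from.
answers_for_a = {"internal": "B", "external": "C"}

# Hypothetical: what each server is configured to listen for.
old_server_listens_for = {"B", "C"}
new_server_listens_for = {"B"}

def missing_listeners(answers, configured):
    """Return the answers for A that the server is not configured for."""
    return sorted(set(answers.values()) - configured)

old_missing = missing_listeners(answers_for_a, old_server_listens_for)
new_missing = missing_listeners(answers_for_a, new_server_listens_for)

print("old server missing:", old_missing)
print("new server missing:", new_missing)
```

“It worked on the old machine” falls out of this directly: the old machine is missing nothing, and the new one is missing C.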

Part IV: THEN there was another problem, with services going out from the new server.  Turns out we forgot to tell group X that we were changing the server host.  So they didn’t update their configs to reflect our new source.  They also had some unrelated problems that affected myriad hosts.  So that was glossed over at the meeting and denied.  Until I pointed out that part of the problem had required them to update their configs to reflect our new server.  At which point, yes, that’s true, but … but what?  That means that there was A problem related to our server move!  Thus it should be part of Lessons Learned!

Oi.  I didn’t make any new friends, I’m sure.  Probably negatively impacted some of the folks I AM friends with in that group.  But Jesus, take your own hits.


Speaking of.  Last week I got a project finalized and it went live on a new server.  And I neglected to note that the old server was accessed via secure tunnels.  So I didn’t check the new server that way.  Code all worked, so I approved it.  And then the next morning had to scramble to find out what everyone’s problem was.

During this scramble, I was getting IMs from the server group asking if I’d checked it.  “Yes.”  Did you check it via secure?  “Uh, no, I didn’t realize it needed it.”  So you didn’t do your testing via the mechanism used to properly access the service?

This set off alarm bells.  We’re setting me up for the fall here.  It’s not THEIR fault, it’s all MY fault.  Which, yeah, I effed up and didn’t check it properly.  But I went and checked my dev environment, because I didn’t remember secure being set up there either.  But it turned out it was.  Great, lemme look … and … yeah, everything works FINE still.  So, yaknow what?  Not really my code being broken here folks.

So I sent a message to all involved, apologized for dropping the ball on the testing, and announced that it all worked with the proper mechanisms on dev, so as soon as the server group identified why the secure tunnel wasn’t set up or configured on the new server, we’d be good.

Server group never responded AFAICT.  In a different conversation I was brushed off with “well, you didn’t tell us you needed it so we didn’t bother configuring it properly.”  And I’ll grant that it’s possible that I told them that I didn’t need it and it wasn’t on dev anyway.  But since I also told them to make production look like dev, one would have thought that it being on dev would have given them pause.  But yeah, that one was me :o
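“Make production look like dev” is easier to hold anyone to (me included) if the difference can be listed.  A minimal Python sketch of that idea, assuming each environment’s settings can be dumped into a flat dictionary; the setting names (`secure_tunnel`, `app_port`) and values are hypothetical, not the real configuration.

```python
def config_drift(reference, target):
    """Return {setting: (reference value, target value)} wherever they differ.

    A setting missing on one side shows up with None on that side.
    """
    keys = sorted(set(reference) | set(target))
    return {
        key: (reference.get(key), target.get(key))
        for key in keys
        if reference.get(key) != target.get(key)
    }

# Hypothetical settings: dev has the secure tunnel, production does not.
dev = {"secure_tunnel": True, "app_port": 8080}
production = {"app_port": 8080}

drift = config_drift(dev, production)
print(drift)
```

Here the only difference reported is the tunnel setting, present on dev and absent on production, which is exactly the question nobody asked before go-live.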