Institutional Knowledge

Wherein we write down some stuff that we know.


Various Data Points

August 15th, 2008 · No Comments

The end of year reports are being written and various folks have come to WEBD looking for what really happened on various systems. These numbers are based on our fiscal year which is from July 1st to June 30th.

University Home Page

  • 2007/08 - 9,289,385 visits as defined by Google Analytics

Yearly Unique Portal Logins (based on daily logs)

  • 2005/6 - 1,905,225
  • 2006/7 - 2,215,433
  • 2007/8 - 2,415,444

Yearly Total Portal Logins (based on daily logs)

  • 2005/6 - 3,377,752
  • 2006/7 - 4,089,277
  • 2007/8 - 4,359,887

Random

These numbers are based on whatever date range we happen to have numbers for.

  • We have 23 Spaces and 372 users in Confluence, our enterprise wiki.
  • 1,492 JIRA issues were created in the 2007/8 fiscal year.
  • 32,915 visits on January 29th, 2008 was our high water mark for the Portal
  • Portal browser stats (9/21/2007 - 6/30/2008) IE 60.47%, Firefox 29.53%, Safari 9.6%
  • Home page browser stats 2007/8 fiscal year - IE 76.18%, Firefox 23.47%, Safari 8.7%

Curious about anything else?

→ No Comments · Tags: Misc.

San Marcos Redesign

August 12th, 2008 · No Comments

Congrats to San Marcos on launching their new web site, powered by Cascade Server no less. Thanks to my automated system for grabbing screen shots I can get a historical perspective. Here is a QuickTime export of the shots:

CSU San Marcos Web Site (.mov)

→ No Comments · Tags: Web Design

Tracking the Applications You Develop

July 29th, 2008 · 2 Comments

Problem

There are a ton of applications being developed locally at your institution and you have no way of knowing who is doing what or how they are doing it. You don’t know what data they are pulling, where they are pulling it from, what they are gathering and where they are storing it all. You don’t know if you’re re-inventing a wheel that another department already developed. Worse, you don’t even really know what defines an “application.” Even if you did know all of those things, how would you keep that knowledge current? You, are in the dark.

Welcome. Unfortunately this isn’t a typical IK post where we go into detail on how to deal with a technical issue or show you pretty stat graphs. No, this is where I’m simply going to outline the issues associated with this problem and hope that somebody has already gone down this path…and that it didn’t lead to madness.

Defining an Application

Is a script an application? What about a script that takes form input and sends it to an e-mail address with no database involved at all? How small do you go? Before you start you need to agree on what makes an application. This, could take a while.

Finding Applications and Developers

Once you’ve established what constitutes an application, you have to find them and, more importantly, the people that made them. It’s a typical problem and luckily one that has been solved before. You need to start as high as you can and work your way down until you discover the technical people. You probably already know a good portion of these folks, but there are undoubtedly people you don’t know about that are developing and maintaining applications somewhere on campus. If you’re going to be thorough, you’ll need to find them.

Tracking

If you’ve tried to keep track of anything on campus you’ve most likely discovered that unless it’s fully automated, you have a problem. Scratch that. You have problems. Plural. You can ask people for updated information over e-mail, which will get you a few well-meaning responses and a lot of crickets. You can send out a spreadsheet for people to update, which will leave you in ‘multiple file revision’ Hell. You might even be so bold as to develop your own application that will allow people to update their own information online. You might even have asked at some point, “How hard can it be?” The problem is that no matter how easy you make this, you are essentially asking other people to do something for you, and therein lies the rub. How can you ask people to do this for you? This is more a people problem than a technology problem, but it’s a problem nonetheless.

Permissions

For the sake of argument, let’s say that we’ve solved the people problem of getting updated information with the application we developed. Now, who gets to see what once they’ve gotten inside? If it is locked down so that people can only see their own information, you’ve created a system that requires them to do work with no real value to them. There is no real benefit in a person only seeing their own information. They’ll most likely want to have access to all the information so they can see what others are doing as well. That was the whole point of the application in the first place, right? So, you open the system and have no restrictions on who can see what. That idea is scary to a lot of people and more than likely they have valid reasons for being wary of this venture. Again, we’ve run into a people problem.

Summary

We have a blind spot on campus and we want to use technology to shed light on what is happening. The technology solution requires that campus agree on a definition, dedicate time to keeping the information current, and decide who has access. These are things that must be dealt with by people, and no matter how much technology you throw at them, they will not go away.

Solutions?

Have you tackled this dragon? Did you win? If so, we would love to hear your story. Actually, we would love to hear your story even if you lost.

→ 2 Comments · Tags: Project Management · Web Development

Portal Stats: The Term in Review

June 13th, 2008 · 1 Comment

At the start of the term, we looked at portal traffic on the first day of the term, in this case spring 2008. It was a busy day, to say the least. Now that we have data for the entire term, what more do we know?

  • The second day of school was the busiest day of the term for visits (not visitors, as defined by Google Analytics), 32,915.
  • Monday after spring break had the highest total visitors, 18,391.
  • Saturday, March 15, the start of spring break, was our least busy day for visits, 4,499.

Now, you might look at these numbers and say, “Pat, that’s exactly what we would expect to happen.” That is correct, but until you actually have the numbers, you don’t really know that your expectations of reality and reality itself match up. Now you do.

The numbers for the whole term look like this:

Google Analytics spring 2008

You might notice some abnormalities in our sparkline graphs. This is due to our introduction of tracking some popular “external links” during spring break. It will slightly distort our page view data for spring 2008 (external links are tracked as views), but as long as we break our reporting into pre- and post-break segments we’ll be fine. Our visitor information remains consistent though.

The external link tracking is important in the portal so we know where people are going when they leave. Our portal strategy has never been to bring applications into the portal, just provide easy access. Before Google Analytics it was difficult for us to track that information. Now we track clicks to all major applications from the portal. That information should prove illuminating in the future.

Overall having Google Analytics in the portal is a big win for us. We’re still keeping all the raw log file information and doing our usual processing, but this wins hands down just for the amazing breadth of information we can look at now.

→ 1 Comment · Tags: Portal

Tomcat SSL Performance Followup

April 2nd, 2008 · 1 Comment

Previously I talked about improving the SSL performance in Tomcat simply by upgrading the JVM. Here we have a somewhat not-to-scale chart showing how we did in a “real world” test. Last night at 6pm an application opened up and sent a flood of users to our authentication service (CAS). Last year we could not handle the flood. CAS stalled, which caused a flood of calls to the help desk. In the business, we call that less than optimal.

We’re rolling CAS on Java 6 now and SSL performance is no longer an issue.

→ 1 Comment · Tags: Web Development

The Arrival of Multi-Search

March 25th, 2008 · No Comments

Meriam Library has announced the arrival of Multi-Search. This is an exciting new way to search the catalog as well as several online databases that we have access to like JSTOR and Academic Search. In the past students would have to search each journal database and our catalog separately. Now, everything is handled through a single search box. Hopefully users will find this means of search much more convenient and a faster way of researching.

→ No Comments · Tags: Information Design · Search

CAS Strikes Again

March 24th, 2008 · No Comments

In the last two weeks, WEBD has upgraded two of our big services: Confluence and JIRA. As a part of these upgrades we turned over authentication to CAS, our single-sign-on service. Now users will be able to jump back and forth between Confluence and JIRA without having to log in a second time. Confluence now becomes a great place to keep documentation and support for web applications that also employ CAS authentication.

→ No Comments · Tags: Authentication

Automating Website Screen Captures on OS X

March 24th, 2008 · No Comments

We have 23 campuses in the California State University system and I like to know what all of them are doing on their web sites. Sure, I could dig into the Wayback Machine, but I find that can give inconsistent results. Instead, I decided that I wanted to automate getting screenshots of all the CSU homepages. This requires some shell scripting, so I’m going to make a lot of assumptions about technical knowledge here. If you aren’t comfortable mucking around on the command line, this may not be the best solution for you.

1. Install webkit2png

If you are on 10.5, you won’t have to install anything but this script.

2. Script Your Shots

I’ve got a small shell script with an array of 23 URLs and a for loop that passes each one to webkit2png.

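#!/bin/bash
# Home pages to capture, one URL per campus.
SITES=( http://www.csub.edu/
...
http://www.csustan.edu/ )

# Pass each URL to webkit2png:
#   -F       full-size screenshot only
#   -W 1024  browser width in pixels
#   -d       include the date in the filename
#   -D       directory to save the shots in
for site in "${SITES[@]}"; do
   webkit2png -F -W 1024 -d -D /Users/pberry/Projects/csushots/ "$site"
done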

3. Profit!

From there, you could just cron the script up and let it run. The -d option will put the date in the filename, so you aren’t blowing away your archive of screen shots with each run. I decided it would be nice if I could organize these in iPhoto and I didn’t want to manually import my shots each time my script ran. I’m no good with AppleScript, so I turned to Automator.

I created a workflow that creates the directory, runs the shell script, imports them into iPhoto (iPhoto is set to copy on import) and then deletes the directory. The extra steps were really just to make the workflow happen. I’m guessing there is a way to do it without creating/deleting, but I was in a hurry. I then created a Smart Album based on the unique parts of the file names (by default webkit2png will use the URL for the filename) for each campus. Using iPhoto also lets me do “fun” stuff, like create movies showing how a page changes over time (QuickTime).

Now I’m able to spot trends and changes in our system. I could extend the process to grab other pages as well, but for now I’m okay with just the main page for each campus.
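For anyone skipping the iPhoto step, the plain cron route mentioned in step 3 might look something like the sketch below. The script name csushots.sh, the 6:00 am schedule, and the SHOOTER override are assumptions made for illustration, not details of the setup described here; only the webkit2png options and the output directory come from the script above.

```shell
#!/bin/bash
# Hypothetical wrapper (csushots.sh) meant to be run from cron.
# Assumed crontab entry, daily at 6:00 am:
#   0 6 * * * /Users/pberry/Projects/csushots.sh
OUTDIR="${OUTDIR:-/Users/pberry/Projects/csushots/}"
SHOOTER="${SHOOTER:-webkit2png}"   # set SHOOTER=echo for a dry run

# capture_all URL...: take one dated, full-size shot per URL.
# Keeps going if a single site fails and reports failure at the end.
capture_all() {
    local site failed=0
    for site in "$@"; do
        "$SHOOTER" -F -W 1024 -d -D "$OUTDIR" "$site" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ]
}
```

A script like this would end with a call such as capture_all "${SITES[@]}". Continuing past a failed site is a design choice: one campus being down should not cost you the other 22 shots for that day.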

Caveats and Misc.

The webkit2png script can have trouble with Flash, so screenshots will look “unfinished” if the site uses a big Flash movie to show photos. /looks at Bakersfield

If you don’t have 10.5, you’ll have to install the PyObjC bridge. It sounds nasty, but it’s easy. Double-click and you’re on your way.

You could do the first two steps on Linux with khtml2png.

→ No Comments · Tags: Misc.

Tomcat SSL Performance

March 13th, 2008 · 2 Comments

We run a number of applications in Tomcat (both 5.0.x and 5.5.x) and for the most part we’re very happy with the performance we get. There is one time of the year when our CAS (Central Authentication Service) gets killed though, and it’s because of SSL connections. Let me elaborate: it’s because of Tomcat 5.0.x running under JDK 1.4.x. One application for one hour out of the year floods CAS with so many requests that it can’t keep up due to the overhead of SSL. JDK 1.4 just can’t deal with SSL very well, or rather, very quickly. The threads fill up and start blocking connections. In the business we call that LTO (Less Than Optimal).

Now, there are many technical solutions: Tomcat has native APR libraries, we could front with Apache httpd, or we have hardware that can do SSL (though the latter has security issues). We never deployed any of them because for all but 1 hour out of the 8,760 hours in a year we do just fine with the existing setup. Yes, I understand that’s only 99.988% uptime, but still it’s pretty good.
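The uptime figure is simple arithmetic: one bad hour out of 8,760. A minimal sketch of the calculation, using a hypothetical helper named uptime_pct, shows that it rounds to 99.989% (99.988% if you truncate instead of rounding):

```shell
#!/bin/bash
# uptime_pct DOWN_HOURS TOTAL_HOURS
# Prints the share of hours without trouble, to three decimal places.
uptime_pct() {
    awk -v down="$1" -v total="$2" \
        'BEGIN { printf "%.3f\n", (total - down) / total * 100 }'
}

uptime_pct 1 8760   # prints 99.989
```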

Now, you’re probably thinking to yourself “Where the heck have these guys been? Java 5 gives you a huge performance boost and Java 6 just adds to the gains provided by 5!” We’ve been deploying Java 5 on upgrades and new applications. We just never got to CAS and honestly there was no real need because CAS is so simple and so solid, you rarely think about it once it’s running.

Give me numbers, Mrs. Landingham!

I fired up httperf and grabbed some numbers.
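For reference, an httperf run against an SSL endpoint might look roughly like the sketch below. Only the 2000 connections match the output that follows; the host name, port, URI, connection rate, and timeout are assumptions for illustration, not the values used in this test.

```shell
#!/bin/bash
# Sketch of an httperf run against an SSL endpoint.
# Assumed, not from the test described here: host, port, URI, rate, timeout.
HTTPERF="${HTTPERF:-httperf}"              # set HTTPERF=echo to print the command
CAS_HOST="${CAS_HOST:-cas.example.edu}"    # hypothetical host name

# run_ssl_test [RATE]: open 2000 SSL connections at RATE new connections/second.
run_ssl_test() {
    local rate="${1:-35}"
    "$HTTPERF" --server "$CAS_HOST" --port 443 --ssl \
        --uri /cas/login \
        --num-conns 2000 --rate "$rate" --timeout 60
}
```

Running the same command against each JVM in turn, with nothing else changed, is what makes the two sets of numbers comparable.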

JDK 1.4.2_06

Total: connections 2000 requests 2000 replies 2000 test-duration 83.398 s

Connection rate: 24.0 conn/s (41.7 ms/conn, <=311 concurrent connections)
Connection time [ms]: min 449.9 avg 6122.5 max 47219.6 median 3891.5 stddev 6211.0
Connection time [ms]: connect 6011.3
Connection length [replies/conn]: 1.000

Request rate: 24.0 req/s (41.7 ms/req)

JDK 1.5.0_15

Total: connections 2000 requests 2000 replies 2000 test-duration 57.203 s

Connection rate: 35.0 conn/s (28.6 ms/conn, <=26 concurrent connections)
Connection time [ms]: min 79.7 avg 255.5 max 3421.0 median 163.5 stddev 230.4
Connection time [ms]: connect 225.6
Connection length [replies/conn]: 1.000

Request rate: 35.0 req/s (28.6 ms/req)

That’s roughly a 31% decrease in the time to process a request (41.7 ms down to 28.6 ms). Now, we all know that there are lies, damn lies, and statistics. This is by no means an exhaustive breakdown of the differences in SSL performance between these two JVMs. This is simply a small bit of empirical data. That being said, it’s probably the cheapest and easiest performance gain you’re ever likely to get.
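If you want to check the percentage change between two httperf runs yourself, a little awk does it. This is a minimal sketch; pct_change is a hypothetical helper name, and the inputs are copied from the two result blocks above.

```shell
#!/bin/bash
# pct_change OLD NEW
# Prints the percentage change from OLD to NEW, to one decimal place.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 41.7 28.6      # time per request (ms/req): prints -31.4
pct_change 24.0 35.0      # request rate (req/s):      prints 45.8
pct_change 6122.5 255.5   # avg connection time (ms):  prints -95.8
```

The average connection time shows the largest change of the three, which fits the symptom described earlier: under JDK 1.4 the connections were piling up rather than completing.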

→ 2 Comments · Tags: Authentication · Recent Projects

Rands Personality Test

February 21st, 2008 · 1 Comment

We’ve all taken the Myers-Briggs Personality Test. Answer a series of questions and you arrive at a 4-letter description of yourself based on four dichotomies covering attitudes, functions, and lifestyle. For example, back in my freshman year in college, I was an INTP or Introversion-Intuition-Thinking-Perceiving.

Based on Rands’ book Managing Humans, I’m proposing the idea of a Rands Personality Test based on the following dichotomies of management styles.


Dichotomies

  • Inward / Holistic
  • Incrementalist / Completionist
  • Mechanic* / Organic*

* Or combinations of the two: Organic/mechanic and Mechanic/organic

From these dichotomies, at this point in my career I would consider myself to be a self-described ICOm or Inward-Completionist-Organic/mechanic.

For those of you who have read Managing Humans, what’s your Rands Personality Type?

→ 1 Comment · Tags: Project Management