So in honor of my decade of blogging, this week will be all about blog stats and such. I figured the best way to start off would be to go through each blogging year and make note of the “highlights” of the past decade. So here you go!
Year 1 (May 2006 – April 2007)
- Started blogging (duh)
- Graduated high school
- Took a cruise to Alaska
- Got my wisdom teeth removed
- Attended band camp for the U of I marching band
- Met two of my best college friends: Matt and Maggie
- Started college
- Met Sean
- Joined Facebook
- The Butt Song made its debut
- A (really crappy) play I wrote was performed in my theatre class
- I took Literature in Western Civilization II and realized that I wanted to study philosophy in more detail
Year 2 (May 2007 – April 2008)
- Went to a drag show, in drag, with Matt
- Started dating Matt at said drag show
- Got hired at my first part-time job: Wendy’s
- I took Tests and Measurements (PSYC 453) and realized that psychometrics was the area of psychology that interested me the most
- The 25-credit semester
- I spent most nights talking to Sean on MSN Messenger, usually until at least 1 AM
- Discovered Leibniz
- First date with Rob and all the subsequent Rob/Jessica drama that entailed
Year 3 (May 2008 – April 2009)
- Worked at the U of I as a summer custodian
- Discovered Metalocalypse
- Moved into the house with Sean, Aaron, Lanky, and Michael
- So much Rock Band
- Got my industrial ear piercing
- Broke up with Rob and dealt with all the drama that entailed
- Got to march a halftime show for the Seattle Seahawks
- Started dating Aaron
- Went to Hawaii with the band
- Went parasailing in Hawaii
- Got my B.S. in psychology
- Got accepted into UBC’s psychology graduate program
- Turned 21
Year 4 (May 2009 – April 2010)
- Got my B.S. in philosophy
- “Broke up” with Aaron (I use quotes because it was the most mutual, amicable break up there could ever be)
- Worked as an in-home caretaker for Seubert’s
- Took another cruise to Alaska
- I…did stuff. This is still private, but it’s worth mentioning because it’s important to me and I think at least one of you knows what I’m talking about
- Moved out of the house with the guys
- Moved to Vancouver
- Started grad school
- Realized my supervisor and I were not the most compatible of people
- Lots and lots of misery
- Lots and lots of rain
- NO REALLY IT RAINED THE WHOLE GODDAMN MONTH OF NOVEMBER I AM NOT EXAGGERATING I KNOW THIS REALLY ISN’T A BIG DEAL IN THE GRAND SCHEME OF THINGS BUT I MEAN SERIOUSLY WHAT IN THE SOGGY FUCK, VANCOUVER
- Won NaNoWriMo 2009
- Started downloading a new song per day
- The 2010 Olympics came to Vancouver. I walked around downtown and got to see the Olympic torch
Year 5 (May 2010 – April 2011)
- Moved to a new apartment in Vancouver
- Went to Boston for the APS conference
- Moved my blog from MySpace to WordPress
- Started walking for pleasure
- Won NaNoWriMo 2010
- Decided not to continue on to the PhD program at UBC
- Got accepted into the philosophy graduate program at UWO
- Went skydiving
- Thesis drama
- Was hospitalized for…reasons
- Ran a 10K (Vancouver Sun Run)
Year 6 (May 2011 – April 2012)
- I got really sick, both mentally and physically
- Was hospitalized again for…different reasons
- Successfully defended my thesis
- Got my M.A. in psychology
- Moved back to Moscow
- Took ANOTHER cruise to Alaska
- Saw Mount Rushmore
- Moved to London, Ontario
- Started grad school (again)
- Quit grad school (again) and moved back to Moscow
- Won NaNoWriMo 2011
- Moved to Marana, AZ to be with my mom
- Moved to Tucson, AZ with my mom
- Started working at Pima Community College as a Disabled Student Resources tech
Year 7 (May 2012 – April 2013)
- Moved back to Moscow
- Started working as a lecturer for the UI stats department
- Went back to undergrad
- Won NaNoWriMo 2012
- Worked as a data analyst for the Ag Department
Year 8 (May 2013 – April 2014)
- Had what was probably my most enjoyable semester at UI
- Walked 1,361.2 miles in 2013
- Got accepted into the University of Calgary’s statistics graduate program
Year 9 (May 2014 – April 2015)
- Got my B.S. in math
- Had to unexpectedly choose between University of Calgary and Carleton University for grad school
- Chose the University of Calgary and moved up there
- Met Nate and subsequently fell head over heels for him
- Won NaNoWriMo 2014
- Won a TA award for fall 2014
Year 10 (May 2015 – April 2016)
- Saw some of the oldest/biggest trees on the planet in the Grove of Titans in Jedediah Smith Redwoods State Park
- Went to my first MLB game and saw the Giants play the Braves at AT&T Park in San Francisco
- Saw the Grand Canyon for the first time
- Visited Yellowstone National Park
- Moved in with Nate
- Adopted Jazzy Cat
- Nate and I got engaged
- Walked 2,523.29 miles in 2015
- Won a TA award for fall 2015
- PhD program drama
- Blogged for 10 years straight!
Man, a lot of stuff can happen in a decade. Writing it all out like this makes it seem like I’ve really become a completely different person from the person I was when I first started this blog. Pretty snazzy, if you ask me.
So today is kind of a big deal for me. Ten years ago, on May 1, 2006, I gave in to peer pressure and started this blog with the intent of posting every day. Because it was 2006, this blog was started on MySpace and was basically just me rambling about my daily activities.
Since then, however, my blog has grown up a little bit, gotten its own domain name, and has become a little archive of my life since 2006, with a post for every day of every year since the blog was started.
So I figured it’s appropriate to acknowledge that today is my 10 year anniversary of blogging! Yup, believe it or not, there’s been a decade of this nonsense.
This coming week will be all about blog statistics and looking back on the past 10 years to see what’s changed for me (and what’s stayed the same). So that’s something to look forward to!
As always, I appreciate everyone who reads this (subscribers and passers-by), and I hope I’ve kept you entertained enough so that you’ll want to keep reading long into the next decade of my blatherings.
ZOMG guys, my one decade anniversary of blogging is one month away!
So here’s the plan: for the week following my decade post (which will be on a Sunday), I’m going to do some blog stats/analyses. I kind of did this back when I hit six years, but I want to be a little more comprehensive this time. I’ll have a huge spreadsheet of all my 10 years of data (including word count, category, title, number of pictures per post, etc.), so there will be a lot of stuff I can analyze.
And we all know I love doing that kind of stuff.
YAY, I’m excited.
OH CRAP, yesterday was my 3,500th blog.
Totally missed it. A+ in blogging.
Anyway, that means I’m 35% of the way to my goal of 10,000 blog posts (well, 35.01% if you count today’s), which is pretty ridiculous considering I had zero hopes of doing more than 50 posts way back when I started this madness.
(Blame my high school friends for this madness).
Here’s to another 6,500 posts!
Okay, so I know we’re still like 9 months away from it, but I’m already excited about my “10 years of blogging” thing next May. That’s a decade of blogging, dudes. That’s a long time.
Here are a few things I plan on doing for the big anniversary:
- Yearly stats: which years had the most words, which years had the highest/lowest GFI scores, all that stuff.
- Category stats: which categories had the most posts, which had the least, which had the most words, which had the least, etc.
- Title stuff! I want to see how many titles I’ve repeated (and how many times) and do a list of my favorites.
- Tag stuff! When I first started using the tags I was super serious about it, but that’s quickly devolved into “can I fit a stupid joke or two in the tags of this post?”
- Graphs. THERE WILL BE GRAPHS.
And probably a bunch of other things I’ve forgotten to include here. Sorry, I started this post and then got distracted reading about Johann Bernoulli, then came back and forgot what else I was going to add to this list.
Claudia: Master Blogger.
Ahoy, faithful readers! (All two of you)
Today is May 1st, meaning it’s time to celebrate yet another blogging anniversary.
Eigenblogger is now nine years old, which means that I’m entering my tenth year of blogging.
How sad is that?
ANYWAY. I have two main goals for this upcoming tenth year–one completely non-blog-related and the other very blog-related:
- Goal 1: Defend my thesis before May 1st. I’d like to be able to say that I got 5 degrees (6 if you count the high school diploma) in the span of 10 years.
- Goal 2: Actually post these damn blogs more frequently…say, once a week at least. For some reason, there are a few people who read this nonsense. I’d like to stop annoying the hell out of them with my erratic posting habits.
Though really…of those two goals, which do you think is more likely? For me, at least.
I’ll have a super awesome stats-filled party post for my 10 year anniversary, but for now…WOO NINE YEARS!
Hey, look who’s been on WordPress for three years now.
Here are my top countries by viewer count since February 2012. I don’t know if I just have one stalker in a few of these high-ranking countries or if just that many people have looked at Eigenblogger once, thought, “OH GOD, NO!” and moved on.
I vote for the latter.
Sorry, my head hurts.
I was planning on doing my first “Canadian Mall” installment for Calgary today, but I had to wait around for my new fridge and didn’t have enough time to walk before any of the malls closed. So that’ll probably happen tomorrow.
I know this blog probably won’t be posted until like November with my horrible uploading schedule, but I’m assuming that by the time you read this you will have noticed that there have been some design changes to my blog—specifically in the color and font departments.
That’s because I was talking to my mom tonight and debating whether to buy the custom colors/fonts option on WordPress ($30/year) and she ran and got her credit card and got it for me.
My mom is awesome. Now my blog can be even more eye-gougingly colorful!
UGH I’m sick of my blog header already. And my blog colors. I JUST WANT IT TO BE PRETTY.
I like my current theme and don’t want to change it, so I’m pretty tempted to pay the extra yearly charge for total color customization.
Problem: I have zero extra money right now.
Solution: Just deal with the current color scheme until I get some dollars.
(Sorry I’m boring)
New header, fools!
I liked the old one, but I just needed a change. Sometimes that happens.
Not sure if I like this one very much now that I’ve got it up, though. Might change it around in a bit.
It’s Claudia again, coming to you live from blog post #3,000!
Yup, I’m 30% of the way to my goal of 10,000 consecutive blog posts.
Of course, given how much I suck at uploading these on time, you probably won’t see this until post #4,000, but it’s A COOL MILESTONE ANYWAY.
I feel bad for not doing anything for my 8 year anniversary. So here’s some crap:
- Total number of words over the past year (May to May): 99,463
- Average number of words per sentence: 12.09 (in general, I write like 27-word sentences, but I think they average out with all my one-word survey responses and my filler text such as “YO!” and “Anyway.” and “Weird.”)
- Gunning Fog Index: 8.25 (so you need about 8.25 years of formal education to understand my blathering, apparently)
I don’t know why in the hell I started this blog in May. I guess I wasn’t thinking that this would always be the busiest month of the year.
Anyway. Thanks again for reading, all you loyal followers! And thanks for all your comments, Matt!
That’s right, I’ve been blogging for eight years now.
That’s 2,923 days (counting today).
I was hoping to do something cool for today, but I’m obscenely busy and didn’t manage to plan anything.
But there will be later anniversaries, I’m sure. Number 10 will be big, I promise!
Thanks for reading, y’all, I appreciate it. :D
CRAP I just realized that my 8-year blogging anniversary is happening in like 45 days. What should I do, anything special?
I was actually thinking of doing a week of vlogs, but I’m an ugly buttface and no one wants to see that AND the basement’s always filthy and no one wants to see that AND that’s like the week before finals or something and I’ll be in too much of a panic to be entertaining.
Heck, I’m not even entertaining now and it’s spring break!
So today’s topic has been on my mind for a while, and since I have nothing else that’s dying to be said today, I felt like I could just talk about it now.
At the beginning of last semester I gave the link to my blog to a newly acquired friend (said friend: I don’t know if you still read this, but if you do, you probably know what I’m going to talk about, haha). They read a few pages and came back to me the next day saying that the whole blog felt rather pointless.
Understandably, this rather quick judgment had me kind of miffed. But then I realized that they probably saw my blog as pointless because they’re just looking at the surface of it.
And at the surface, it is pretty pointless. I mean really, there’s not much else here apart from surveys, crappy art, crappy rants, internet stuff, and me obsessing about school/stats/Leibniz/other such things.
But I’ve kept this blog since 2006. That means that I’ve blogged from my last month in high school up until now. That’s about 7.5 years now. That’s a long freaking time, especially when it comes to keeping up a daily routine. And there are a lot of reasons why I’ve decided to keep the routine of blogging and why this blog may seem, at the surface, “pointless.”
- I use this blog to get out some of the extra noise in my head. Doing so may take the form of a rant, a reflection, a discussion about school, whatever. Writing it out just helps to clear some of the clutter.
- I use this blog to post some of my creative things: my writing, my drawings, my “songs.” However crappy they are, I do like to get them out there somewhere.
- I use this blog to talk about my passions. Not a lot of my friends like the things I like. It’s hard to find someone to sit down with and rave about statistics. It’s hard to find someone to quote Achievement Hunter with (actually, since I don’t know anyone in person who’s a fan of them, I can’t really find anyone to do this with) or who will talk about how wonderful Leibniz is. I talk about my passions here because if I didn’t, I’d never get to talk about them at all.
- I use this blog to retain my sanity. I had practically no friends in Vancouver, no friends in London, no friends in Tucson, and very few friends now that I’m back in Moscow. I know it sounds cheesy, but just having somewhere where I could get words out into the public (even if that public was the vast, disinterested internet) made those times manageable.
- I use this blog to record my life. It’s not all of my life by any means, but it’s a decent chunk of it. I can’t count the number of times I’ve been able to use my blog to go back and find the specific date or dates of something important or something school-related. I love having a record of things, and that’s what this blog is.
So yeah. On the surface, all my posts (or at least the vast majority of them) are probably pretty meaningless. But they mean something to me. They’re part of me. And I’m okay with it if you don’t want to read it.
Just please don’t think of it as pointless, because it most certainly is not.
When I started blogging seven years ago, I would never have guessed I would still be doing it in 2013. Never, ever, EVER would have guessed.
Hell, I started it on a whim.
And yet here we are.
I haven’t planned anything ‘cause I’ve been crazy busy and I only remembered that this was my anniversary about an hour ago.
So instead I’d like to thank all my loyal followers, both long-time (MATT!) and newer. I don’t understand why you follow, but I certainly appreciate that you do. Feel free to leave comments. I love comments. I promise I’ll be better about replying to them once this psycho semester is over.
I’d also like to thank Aneel and E’raina who were the two who peer-pressured me into blogging in the first place. Doing so has been very cathartic at times and at other times has just let me get out all the sludge that builds up in my brain to make room for semi-normal thought.
20 more years to go!
2,500 days ago, I published my very first blog post.
That’s a little less than 7 years ago (6 years, 10 months, 4 days), for those of you needing a different metric.
A lot can happen in 2,500 days.
I, for instance (in roughly chronological order):
- Graduated high school
- Got my wisdom teeth pulled
- Went on 2 cruises to Alaska
- Marched a Seattle Seahawks halftime show
- Witnessed the rise of YouTube
- Witnessed the fall of MySpace
- Discovered my passion: statistics
- Got dumped by someone
- Discovered Leibniz
- Roomed with one guy I knew fairly well plus three guys I didn’t know at all
- Dumped someone
- Had the most genuine relationship I’ve ever had
- Earned 2 bachelor’s degrees
- Turned 21
- Got accepted into grad school twice
- Survived swine flu
- Lived in Vancouver, BC
- Witnessed the rise of Twitter
- Ran a 10k
- Went skydiving
- Earned a master’s degree
- Quit a PhD program
- Walked over 1,000 miles in about 11 months
- Saw Mount Rushmore
- Lived in London, ON
- Been (mentally) very, very sick
- Took a year off from school
- Medically withdrew from a PhD program
- Lived in Tucson, AZ
- Got a job teaching statistics
- Went back to undergrad
- Won NaNoWriMo four times
- Turned 25
- Blogged every day
And probably a lot of other stuff that I’m not remembering off the top of my head.
But you know what’s really cool? I’m currently only ONE QUARTER of the way to my goal of 10,000 posts.
Think of all the other things that will happen by post 10,000! That’s not going to be until 2033. What will the world be like in 2033?
I’m excited. That seems so far away.
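For anyone who wants to check the date math, here's a minimal Python sketch. It assumes exactly one post per day with no gaps, starting on May 1, 2006; the function names are made up for illustration.

```python
from datetime import date, timedelta

FIRST_POST = date(2006, 5, 1)  # start date given in the blog


def date_of_post(n: int) -> date:
    """Date of the n-th post, assuming one post per day with no gaps."""
    return FIRST_POST + timedelta(days=n - 1)


def days_since_first_post(today: date) -> int:
    """Number of days elapsed between the first post and `today`."""
    return (today - FIRST_POST).days


# When would post 10,000 land under the one-a-day assumption?
print(date_of_post(10_000))
# How many days after the first post is a given date?
print(days_since_first_post(date(2013, 3, 5)))
```

Under those assumptions, post 10,000 falls in 2033, which matches the estimate above.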
So I’ve had my domain name (eigenblogger.com) for a year now. The WordPress map was implemented at about the same time:
Every continent except for Antarctica, bitches!
More specific viewing locations from ClustrMap:
I’ve actually got a couple milestones coming up:
10,000 views (since I’ve been on WordPress)
My 2,500th post (a fourth of the way to my goal!): that’s coming up in like 15 days
7-year anniversary: May 1st…it’s a while until then, but still.
It’s the last day of the big statistics marathon. Sad? I am. But I got a few new R projects coming, so you’ll be subject to those shortly.
Anyway. Today is less about stats analyses and more about just general naked-eye trends. What questions we’re looking at today:
A. What are my most popular blogs by view count on WordPress?
B. What are some of the most popular search terms people have used to arrive at my blog?
C. What are some of the most hilarious search terms people have used to arrive at my blog?
D. Blogs/topics I think are worth sharing that didn’t make my Best Of list up top.
I’ve been on WordPress since September 1st, 2010. Since then, my most viewed blogs have been:
- (153 views) Scrabble Letter Values and the QWERTY Keyboard
- (149 views) Colored Beats!
- (58 views) Oh look, PayPal wants me to fill out a survey
- (34 views) TWSB: Well, it certainly would make the cartographer’s job easier…
- (28 views) TWSB: Weebles Wobble (But They Wouldn’t if They Had Three Legs)
- (26 views) Pi vs. e
- (19 views) An analysis of statewise uniform population density (according to Craigslist)
- (19 views) Claudia’s 365 Days of Music – A Review
- (18 views) 5 x 20 seconds of fun
Those may not seem like tremendously large viewing numbers, but considering I’ve got over 2,000 posts and like three people who actually frequent Eigenblogger, 153’s not too bad. Part B explains some of the numbers.
Speaking of which…
Top 10 search phrases are:
- “colored beats”
- “Leibniz porn”
- “what one thing could paypal have done to improve your experience with the account limitation process”
- “le seul mot juste”
- “scrabble letter breakdown”
- “scrabble letter values”
- “scrabble letters”
- “scrabble letter rank”
- “rho rho rho your boat”
Yes, a freakishly large number of the times my blog has been found have been because of somebody (somebodies?) searching for “Leibniz porn.” That is simultaneously awesome and confusing. Does “porn” mean something like “metaphysical texts” in some other language? If not, and at least one person out there is searching for legitimate calculus-oriented, ostentatious wig-wearing, best-of-all-possible-smut Leibniz porn, WHO THE HELL ARE YOU AND WILL YOU BE MY SOUL MATE FOREVER?!
Le Seul Mot Juste was the name of my blog up until like three months ago.
And “rho rho rho?” Who the hell knows. Maybe my intellectually-compatible-perfect-future-boyfriend-husband-thing (hereafter referred to as my ICPFBHT) was trying to make some sort of stats pun as he sat hunched over his computer keyboard in a darkened room, chugging Red Bulls and listening to electronica. Naked. With stacks of Leibniz’ works next to him.
People have found my blog by searching for rather humorous things such as:
- “jokes about leibniz cookies”
- “analysis without anal”
- “paddled in parachute pants”
- “yo dawg science”
- “jokes about godot”
- “if your a noodle and you know it clap your hands” (yeah, I have no idea, either.)
- “who the hell is millard fillmore”
- “gavagai turnips”
It’s shameless self-promotion time! I was going to make a big ol’ flowchart thing that showed you what blogs to go for depending on your general interests, but I’m lazy and I’m sure none of you readers really care that much, so you get this instead.
Got here via a statistics-related post and/or are interested in random recreational stats parties? Why not check out my blogs under the Statistics category?
Interested in philosophy?
What about science?
(Want to read me bitch about stuff?)
Haha, that’s all I got. So there you go! Six days’ worth of stats for six years’ worth of blogs. I hope to entertain you all for another six years at least.
Thank you for reading! Seriously. I’m not all about acquiring followers, but it is really nice to have regular readers. :)
Today is mega trends day. I’ll be looking at blog-wide stuff like the overall changes in word count and the overall changes in the Gunning Fog Index. Woohoo!
A. The Word Count per blog has increased as time has gone on. That is, my blogs today are longer than my blogs when I first started.
B. The GFI per blog has increased as time has gone on.
C. There is no significant correlation between Word Count and the GFI.
I performed a regression (aka a glorified correlation in this case) between Word Count and Blog Number to determine if the number of words per blog has increased as time has gone on. Which indeed it has; predicting Word Count by Blog Number, the regression equation can be written as Word Count = 0.0613*Blog Number. Blog Number predicts a significant proportion of variance in the Word Count variable, F(1,2190) = 14.15, p < 0.001. Here is a plot. The red line is the regression line. As always, click on those bad boy plots to see them more clearly.
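If you want to poke at this kind of regression yourself, here's a rough pure-Python sketch of a simple least-squares fit with its F statistic. The data below is made up for illustration (it is not the blog's real word counts), and the function name is hypothetical. Getting a p-value would need an F distribution from a stats package, so that part is left out.

```python
def simple_regression(x, y):
    """Least-squares fit of y on x; returns slope, intercept, F, and (df1, df2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    # One predictor, so the regression has 1 df and the residual has n - 2.
    f_stat = ss_regression / (ss_residual / (n - 2))
    return slope, intercept, f_stat, (1, n - 2)


# Hypothetical toy data: blog number vs. word count.
blog_number = [1, 2, 3, 4, 5, 6]
word_count = [100, 120, 90, 150, 160, 170]
print(simple_regression(blog_number, word_count))
```

With 2,192 posts the degrees of freedom come out to (1, 2190), which lines up with the F(1,2190) reported above.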
Same procedure for GFI vs. Blog Number. The GFI, or Gunning Fog Index, remember, is a measurement of the readability of English writing and its values correspond to the number of years of formal education a person must achieve in order to fully understand the written passage. For example, a GFI of 10 suggests that an individual must have completed 10th grade in order to understand the material. To achieve near universal understanding, Wiki recommends that the GFI of a bit of text hover around an eight.
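For reference, the standard Gunning Fog formula is 0.4 times the sum of the average sentence length and the percentage of "complex" words (usually taken as words of three or more syllables). Here's a small sketch that takes the counts as inputs; how the counts were actually produced for this blog isn't stated, so the function and the numbers are purely illustrative.

```python
def gunning_fog(total_words: int, total_sentences: int, complex_words: int) -> float:
    """Gunning Fog Index from raw counts.

    complex_words is the number of words with three or more syllables
    (counting those is left to whatever tool you trust).
    """
    avg_sentence_length = total_words / total_sentences
    percent_complex = 100 * complex_words / total_words
    return 0.4 * (avg_sentence_length + percent_complex)


# Hypothetical passage: 100 words, 5 sentences, 10 complex words.
print(gunning_fog(100, 5, 10))
```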
Anyway. The regression equation here is GFI = 0.0008639(Blog Number). Blog Number predicts a significant proportion of variance in the GFI variable, F(1, 2190) = 51.86, p < 0.0001. Here is another plot with another regression line.
Finally, I tested the correlation between Word Count and GFI. The correlation was -0.0028 but was not significant with t = -0.1287, p = 0.8976.
A. Supported! The regression line isn’t very steep, but it’s significant still.
B. Supported! That’s actually a pretty impressive regression line, in my opinion.
C. Supported! There’s practically no correlation at all between the length of my blogs and the level of comprehension. I blame the surveys.
Yay, I’ve been waiting for this day! Why? ‘Cause I get to use Wordle. I don’t have any hypotheses for today; rather, I have three main questions of interest.
Question A: how do my “commonly used words” change throughout the years?
Question B: are there some words I use more than others in my blog titles?
Question C: looking at my blog in total, what are my most commonly-used words?
Question A: Using Wordle’s word counts, here’s a table of my top 10 words for each year (note: Wordle can automatically remove “common” words like the, and, a, etc., so I did that). Words consistently highly used across the years are colored.
(Year 1’s “Andy” is because of a short story I posted. Year 4’s “Hate” is because of grad school.)
My top 10 words I use in my titles are:
- Waiter (from all my “Waiter! There’s a…” titles)
Here is a Wordle of my top 100 words spanning all six years!
I would have guessed I’d used the word “blog” a lot more. And my own name less. I use my name in my blog more than “haha” and I’m always dropping “haha”s all over the place! What.
Bonus: here are a few of my common phrases by year. A lot of these are biased because of one blog containing a repeating phrase, but they’re still amusing.
- “Claudia is”
- “Airplane airplane airplane airplane”
- “Who cares about apathy”
- “ag sci computer lab”
- “if you had sex”
- “the socio-adaptive force”
- “who said hello”
- “I can be absolutely fine”
- “go ahead and stir baby” (haha, it took me like twenty minutes to try and figure out why this was a popular phrase; then I remembered it was because of this)
- “the fact that I”
- “wifey wifey wifey wifey”
- “have you ever”
- “best of all possible” (hahaha, this was the year I discovered Leibniz)
- “the mad scientist’s life”
- “the last time you”
- “I hate this” (yup, grad school time)
- “your conversational partner has disconnected” (and Omegle time)
- “approach to environmental ethics”
- “what do you want”
- “today’s song”
- “this week’s science blog”
- “today’s song”
- “you have no idea”
- “for quite some time”
- “all of a sudden”
- “what do you think of”
- “I miss happiness”
- “what would it be”
- “the last time you”
- “sure why not”
It’s day three!
Today we’re looking at three different variables: trends in my Titles, the frequency of blogs involving Surveys, and the frequency of blogs involving Images.
To make sense of these variables and the stats surrounding them, I had to code them. As I said in my first blog stats-related post this week, for the Titles variable, titles were coded 0 if they had nothing to do with the blog content whatsoever (e.g., “Do obedient consonants respond to a Q queue cue?”), a 1 if they were directly relevant to the blog content (e.g., “Greek letters as broken down by meanings in Statistics: a subjective and torturous endeavor”), and 2 if they weren’t completely unrelated but one couldn’t guess the blog content from the title (e.g., “ZOMG”). For the Surveys variable, I just coded the blog entry as 0 if it didn’t contain a survey and 1 if it did. Same thing for the Images variable—a 0 if there were no images and a 1 if there were one or more images.
So. Do I have any hypotheses? Of course I do!
A: The majority of my blog titles have nothing to do with the blog content (that is, they’re coded as 0).
B: I’ve posted more Surveys as time has gone on.
C: I’ve posted more Images as time has gone on.
D: Blogs with Images have fewer words than blogs without Images.
Quick initial analysis: a pie chart of titles!
Hahaha, a quarter of my blog titles tell you absolutely nothing about the associated blogs. That’s fantastic.
Now some more serious fun. To determine whether the amount of Surveys I’ve been posting has been increasing with time, I first made a graph that looks like a bar code to get a rough idea of the frequency/spacing of surveys in my blog*. Each black vertical line represents a Survey blog (y-axis runs from 0 to 1 but since Survey is coded as either a 0 or 1, the appearance of a line indicates Survey = 1).
Second, I looked at the correlation between Blog Number (blog 1 was May 1, 2006, blog 2,192 was May 1, 2012) and the presence of a Survey. The way the coding works, a positive correlation would indicate that as time progressed, I had a greater tendency to post a survey-containing blog.
In this case, I did get a positive correlation of rpb = 0.071. This isn’t the usual Pearson r correlation because I’m not comparing two continuous variables; rather, it’s a point biserial correlation to accommodate the dichotomously-coded Survey variable. However, it’s mathematically equivalent to the Pearson r, so I felt comfortable running a test of significance on the correlation. Turns out, the little .071 correlation is statistically significant, t = 3.346, p < 0.001. This means that the true correlation between Blog Number and the number of surveys I post is not zero and I’ve been posting more and more surveys as time has gone on.
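To see the "point biserial is just Pearson in disguise" claim in action, here's a minimal sketch that computes the correlation both ways and converts r to a t statistic with n - 2 degrees of freedom. The data is a made-up toy example and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Ordinary Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(binary, values):
    """Point-biserial correlation via the group-means formula."""
    ones = [v for b, v in zip(binary, values) if b == 1]
    zeros = [v for b, v in zip(binary, values) if b == 0]
    n, n1, n0 = len(values), len(ones), len(zeros)
    mean_all = sum(values) / n
    # Population SD (divide by n), which is what this formula calls for.
    sd_pop = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    mean_diff = sum(ones) / n1 - sum(zeros) / n0
    return mean_diff / sd_pop * math.sqrt(n1 * n0 / n ** 2)


def t_from_r(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical toy data: blog number and whether that post had a survey.
blog_number = [1, 2, 3, 4, 5, 6, 7, 8]
has_survey = [0, 0, 1, 0, 1, 0, 1, 1]
print(point_biserial(has_survey, blog_number))
print(pearson_r(has_survey, blog_number))
print(t_from_r(0.071, 2192))
```

The two correlations agree up to floating-point error. Plugging in the rounded r = 0.071 with n = 2,192 gives a t of roughly 3.33, in the same ballpark as the 3.346 reported above; the gap is presumably rounding in the reported r.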
Taking the same procedure with the Blog Number variable and the dichotomous Image variable, here’s another bar code-esque pic (black lines = blogs containing 1+ image):
Here we get an even stronger correlation of rpb = 0.194, which is statistically significant, t = 9.273, p < 0.0001. This shows that the true correlation between Blog Number and the number of Images my blog contains is not zero, and I’ve been posting more and more blogs containing an image as time has gone on.
Finally, I checked out word count between all blogs with Images and all blogs without Images. I made two subset data sets, one containing all the blogs with images, one containing all the blogs with no images, and ran a t-test. The difference in word count was (to me) surprisingly large and definitely significant, t = 6.658, p < 0.0001. The actual means of the No Image vs. Image blogs were 290.425 words and 177.925 words, respectively.
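Here's a sketch of a two-sample t-test on two subsets like these. The post doesn't say which flavor of t-test was run, so this assumes Welch's version (unequal variances), and the word counts are invented for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical word counts for posts without and with images.
no_image = [310, 250, 400, 280, 220]
with_image = [150, 200, 120, 180, 240]
print(welch_t(no_image, with_image))
```

A positive t here means the first group (no images) has the larger mean word count.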
Hypothesis A: Haha, totally not supported, and actually opposite: most of my Titles ARE directly relevant to the content. That’s…surprising to me. I name my blogs right before posting them (which is usually like a decade and a half after I write them, given how often I update this blog), and I’ve usually used the “mash the keyboard until the letters make sense” approach to titles. That, or “let’s see what dumb pun I can make today!”
Hypothesis B: Supported! This is probably strongly due to the fact that I’m working to complete the 5,000 Question Survey and have been working on it since late 2010.
Hypothesis C: Supported! WordPress makes it substantially easier to include images than MySpace ever did. Also, more time spent on the internet now = more random humorous images found via StumbleUpon/Tumblr/other blogs/etc.
Hypothesis D: Very supported. The actual word count difference between blogs with and blogs without Images was surprising to me, though the sample size difference could probably be considered a culprit. I guess I shouldn’t be too surprised, though; going through the archives I found quite a few blogs that were like “here’s an image!”, the image, and nothing else.
*Yeah, I know there’s got to be a more sophisticated way to represent this. Creating a CDF doesn’t work with a dichotomous variable. Maybe if I write a loop that adds all the preceding 1’s to each instance of a 1 it hits as it goes from Blog Number = 1 to Blog Number = 2193, and then create sort of a pseudo-CDF using that…hmm…next week’s project!!!
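The loop described in that footnote is basically a running total. Here's a minimal sketch of it using `itertools.accumulate`; the 0/1 flags are a made-up example and the function names are hypothetical.

```python
from itertools import accumulate


def running_survey_count(survey_flags):
    """Cumulative number of survey posts up to and including each blog number."""
    return list(accumulate(survey_flags))


def running_survey_proportion(survey_flags):
    """Pseudo-CDF: share of all survey posts that had appeared by each blog number."""
    total = sum(survey_flags)
    if total == 0:
        return [0.0] * len(survey_flags)
    return [count / total for count in accumulate(survey_flags)]


# Hypothetical flags: 1 = post contains a survey, 0 = it doesn't.
flags = [0, 1, 0, 1, 1]
print(running_survey_count(flags))
print(running_survey_proportion(flags))
```

Plotting the proportion against Blog Number gives a step curve that bends upward wherever surveys start showing up more often.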
Yo, blogland! Time for another round of “stats no one cares about except me!”
Today we’re looking at Word Count by Day of the Week, Month, and Year. I’d like to see if there are any general trends or if I blather on about nothing in relatively consistent bursts across time. Maybe if all these days of analyses reveal some trends, I could try fitting a model to this data. I loves me some model fittin’.
Onwards and upwards!
A. No one day of the week will have a statistically significant difference in word count from any other day of the week. I don’t think I blog more or less over the weekend, and I see no reason why any day of the five-day week would have longer blogs than any other.
B. I don’t know if they’ll be significant or not, but I’m predicting that word counts will in general be higher during the spring school months (January – April at least) than the summer/winter months. The more responsibilities I have, the more I turn to blogging for procrastination, and I usually take more credits in the summer.
C. From highest word count to lowest: Year 6, Year 2, Year 5, Year 4, Year 1, Year 3.
Here is a pie chart (a tasty, tasty pie chart!) of the percentage of words I’ve written by the day of the week.
Pretty equal, eh? But what does the ANOVA say? According to the stats, there are no statistically significant differences in word count by day of the week, F = 0.642, p = 0.697. According to the Tukey HSDs, none of the individual pairs of days of the week are statistically significant in terms of their word count, either.
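If you're wondering where an F like that comes from, here is a minimal sketch of the one-way ANOVA arithmetic: variation between the group means divided by variation within the groups. The word counts below are made up, and the function only returns F and its degrees of freedom; the p-value and the Tukey HSDs would still need a stats package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # How far each group mean sits from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # How far each value sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical word counts grouped by weekday; NOT the real archive.
word_counts_by_day = {
    "Monday": [210, 340, 180],
    "Tuesday": [260, 190, 300],
    "Wednesday": [230, 280, 200],
}

f, df1, df2 = one_way_anova_f(list(word_counts_by_day.values()))
print(f"F({df1}, {df2}) = {f:.3f}")
```

An F near 1 or below means the groups differ about as much as you'd expect from noise alone, which is the situation the day-of-the-week result describes.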
Here is another pie chart. This one shows percentage of words by month.
Again, pretty even. Stats? F = 1.505, p = 0.123, meaning that there are no statistically significant differences in word count by month. No statistically significant differences in any of the pairs of months, either.
Finally, we jump to the largest span of time I’m looking at: years! Pie pie pie pie pie:
Haha, holy crap, Year 6 and Year 2 combined account for nearly half of the words in my total blog. Poor little Year 3.
And finally we see some significance! There is a statistically significant difference in word count by blog year, F = 11.021, p < 0.001.
Hypothesis A: Supported! All days of the week are subject to equal amounts of my blathering. Poor things.
Hypothesis B: Eh. Technically January, February, March, April, and May are the wordiest months, but they’re not significantly so.
Hypothesis C: Woo! I totally called it. If anyone’s curious, Year 3 was a word drought because I was living in the house with the guys and I had…other stuff occupying my time.
More to come tomorrow, ladies and gents!
STATS TIME! Are you excited?
First, I want to preface all of this with the list of variables I kept track of when going through my blog archive:
- Blog Number. My first blog is coded as 1, the second as 2, the third is 3, and so on up until 2193.
- Year. Which blogging year the blog came from. There are six years, each spanning May – May.
- Month. January, February, etc.
- Day. The 1st of the month, 2nd of the month, etc.
- Weekday. Monday, Tuesday, etc.
- Word Count. Word count of each post, not counting the title.
- GFI. Gunning Fog Index.
- Punctuation. How many punctuation marks the post contained.
- Title. 0 = title unrelated to blog content, 1 = title directly relevant to blog content, and 2 = ambiguous title; could be related or unrelated.
- Survey. 0 = blog does not contain a survey, 1 = blog contains survey.
- Image. 0 = blog does not contain any images, 1 = blog contains 1+ image(s).
- Category. What category did I tag my blog as (details below).
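For the curious, here's a rough sketch of how the three text-based variables (Word Count, Punctuation, and GFI) could be computed for a single post. It uses the standard Gunning Fog formula, 0.4 × (words per sentence + 100 × the share of words with three or more syllables), but the syllable counter is a crude vowel-run heuristic and the sample text is invented, so treat this as an approximation rather than the method actually used for this dataset.

```python
import re
import string

def count_syllables(word):
    """Rough syllable estimate: count runs of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text):
    """Approximate Gunning Fog Index; assumes text has at least one word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # "Complex" words have three or more (estimated) syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

# Hypothetical post body; NOT a real entry.
sample = "I blog a lot. Psychometrics is interesting!"

word_count = len(re.findall(r"[A-Za-z']+", sample))
punctuation = sum(ch in string.punctuation for ch in sample)
print(word_count, punctuation, round(gunning_fog(sample), 2))
```

The full GFI definition also skips proper nouns and some common suffixes when counting complex words; this sketch ignores those rules to stay short.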
ALSO NOTE: significance is always judged at the p = 0.05 level. Just didn’t want to have to keep specifying that. :)
So! Today we’re looking at Categories. There are 35 of them (or there will be once I go through and delete all the old “defunct” tags from the few blogs that still have them). Here’s the list in case anybody gives a crap:
So what are we looking at within this sexy, large dataset with respect to categories, then?
Questions of Interest
A) What is the distribution of the categories? That is, which categories are most popular and which are hardly ever used?
B) Do certain categories have a statistically significant different amount of words per post than the other categories?
A: The most popular categories (by percent) will be Blogging, School, and probably Surveys.
B: The least popular categories will be Ramblings and Sports.
C: Categories with a significantly different number of words per post will be Surveys, Philosophy, and Rants.
D: The three categories specified in Hypothesis C will have higher word counts, not lower.
LET’S DO THIS NOISE.
First up, a pie chart! This was my first attempt at visualizing category percentages. By the way, I definitely would have titled this like a good little statistician, but I couldn’t get the image large enough (in my opinion) with the title included. So I’ll call it Percent of Blogs by Category (NOT percent of words by category; that’s just in the ANOVA below).
I had to screw around with this a lot to get it in the easiest to read color scheme. Pie chart with 35 slices = not the best visual, but I think it’s still better than a bar graph in this case.
Table o’ actual counts (click to blow it up so it’s actually readable, haha):
God, all those Blogging blogs.
Second: ANOVAs! Well, okay, just one. But it’s an ANOVA!
According to a more in-depth, ANOVA-driven analysis…
- The mean Word Count per blog is statistically significantly different depending on blog Category, F = 23.184, p < 0.001.
- Blogs in the Surveys category have a significantly higher word count than the other categories, t = 7.739, p < 0.0001.
- Blogs in the Writing category have a significantly higher word count than the other categories, t = 3.624, p < 0.001.
- Blogs in the Philosophy category have a significantly higher word count than the other categories, t = 3.365, p < 0.001.
- Blogs in the Rants category have a significantly higher word count than the other categories, t = 2.480, p < 0.05.
I (or R, rather) also computed a buttload of Tukey HSDs (595 of them!) to test the mean differences between each pair of categories, but most of the significant ones involved (as expected) Surveys, Writing, Philosophy, and Rants.
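In case 595 sounds like a suspiciously specific number: it's just the count of unique pairs you can make out of 35 categories. A quick sketch (the four category names in the second half are only there to show what "each pair" means):

```python
from itertools import combinations
from math import comb

# Number of unique pairs among 35 categories: 35 * 34 / 2.
n_categories = 35
n_pairs = comb(n_categories, 2)
print(n_pairs)  # 595

# Tiny illustration with four of the category names.
categories = ["Surveys", "Writing", "Philosophy", "Rants"]
pairs = list(combinations(categories, 2))
print(len(pairs))  # 6
print(pairs[0])    # ('Surveys', 'Writing')
```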
Hypothesis A: supported! Blogging and school, man: my life.
Hypothesis B: mostly supported! There were a few categories that had nearly as few entries as Sports. I’d get rid of the Ramblings category, but then I’d have 34 categories, which isn’t a nicely divisible number like 35 (I like numbers ending in 0 or 5). Guess I just need to ramble more.
Hypothesis C: mostly supported! I’d totally forgotten about Writing.
Hypothesis D: supported! Surveys, Writing, Philosophy, and Rants contained blogs that had higher than average word counts.
Tune in tomorrow for more stats no one cares about except me!