Saturday, March 5, 2016

Biking Vashon Island


I've ridden Vashon Island a couple of times with friends now.

It's a hilly 2,000', elevation gain, 31-mile ride with two options to extend it another 2.5 and 10 miles.

I've annotated a map and created a simple cue sheet (link), and this is a route that you can upload into your bike computer or smartphone (link).

There are other places to rest safely and other food opportunities, both noted, but Burton Coffee Stand (20 miles) and Snapdragon Bakery & Cafe (26.5 miles) are an essential part of the experience, especially when taking a respite from the weather.

Thursday, January 28, 2016

Removing Newlines In Quoted Strings

Sometimes double-quoted strings in comma-delimited files (CSV) contain newline characters.

Newline characters embedded within strings really mess up any downstream processing that depends on newlines, well, denoting newlines! So you have to remove them or substitute some other value, such as a space.

I recently encountered this problem, working with a nearly 300,000 line file of election data. Manually editing the file was obviously not feasible.

This turns out to be an extraordinarily difficult problem to solve, for a variety of reasons that I learned serially the hard way:
  • Unix command-line tools work on lines, but the quoted strings span lines! -- You're trying to stitch back together into one line what command-line tools consider two lines. There seem to be some sed and awk solutions, but they seem very complex (link), and I could never get them to fully work in my scenario. (More on that below.)
  • Different *nix operating systems have different versions of command-line tools -- Another problem is that the tips you can find by searching the Web don't work on all Unix variants (e.g., Linux vs. Mac/BSD). So, when things don't work, you're constantly wondering whether you've done something wrong or whether your flavor of Unix is just incompatible with you the version some poster was using and assuming. I ended up installing GNU command-line tools, to try to get around this problem (link). Unfortunately, I still couldn't get the command-line suggestions to fully work in my scenario.
  • Excel isn't available as a tool -- Sometimes you can import data into Excel and do some simple transformations on it there, e.g., using Find & Replace. Again, I couldn't in this scenario, because of some quirks in Excel. On the one hand, if you implicitly import a .csv file into Excel by simply double-clicking on it, Excel does some magic and splices the split rows back together for you! At first that seems like a wonderful bonus, but it turns out that Excel also applies "intelligent" formatting to the columns based on data in them. For example, if it sees data like "01-30" is assumes it's a date. That sounds great, unless the data actually isn't a date, because Excel actually changes the underlying data to be a date, and there's no way (that I know of) to unformat it, to convert it back to the original data! I had some precinct identifiers that looked like dates, so I couldn't leverage Excel's auto import. On the other hand, if you explicitly import a .csv file into Excel--by opening a new worksheet in Excel and choosing File / Import ... , and explicitly identify the data types for each column to get around that auto-formatting problem--Excel doesn't then do the magic stitching of the split lines. It makes no sense, but there seems to be no setting to enable this, no way to get explicit import to work like auto import in Excel in that regard. Sigh. 
Once you've worked your way through all of that, the obvious solution is to write a little script to do the job yourself. How hard could this be? Read a character and write a character with a little state machine in the middle to keep track of whether you're inside a double quote or not. 

It turns out that a couple more problems make that seemingly simple script not so simple:
  • Newlines are invisible or unprintable characters -- The first problem is that newline characters are not visible in some popular text editors like Sublime Text. In others, like TextMate, you can View / Show Invisibles but then the invisibles only show up with little placeholder UI widgets. For example, there's a little grey NOT sign (¬) anywhere there's a newline. But all invisibles show up with the same representation. The much bigger problem, though, is that the carriage return (CR) then line feed (LF) sequence of invisible newline characters shows up as one unprintable character in TextMate! IOW, it can look like there's one newline character when there are, in fact, two, which, of course, can totally confound your otherwise seemingly simple code. What should work simply doesn't ... or doesn't always work in seemingly mysterious ways. Which leads to the last major complication:
  • Different applications, editors, and operating systems represent newlines differently -- A variety of combinations of CR and LF are used (link). Normally, you don't have to pay attention to this. However, when you're working with files coming from different systems (Windows vs. Mac vs. Unix), files produced/consumed by different applications that you don't normally use (I was using an open source GIS application called QGIS), or you're creating new files with text editors (I created some small sample files, by copying and pasted lines from the large file), the representation of newlines can get all mixed up, again confounding an otherwise simple script.  
The Unix command-line dump tool, e.g., od -c <file>, would seem like it would be helpful, but in light of the above in my scenario, not so much. Dumping works great on small sample/test files, and I did this a bunch. But, as I discovered eventually, my sample data represented newlines differently than the big file. od would have showed me that there were, in fact, two characters where I thought there was one in the real data, but the real data was gigantic and the offending data was buried and intermixed within it, so dumping the whole file was not practical or useful.

After beating my head against the wall for quite sometime, trying for the life of me to figure out why the simple Perl script I'd written worked on my sample but not on the real data--even when the sample was a copied and pasted from the original--I finally realized that the sample had single LF newlines while the full file used CRLF newline sequences. The text editor I had used to create the sample file, TextMate, had quietly used a different newline representation for the new file, even though I'd opened the real data file in same text editor. So the script was doing exactly what I wanted it to do, but the data wasn't what I thought it was!

(A side note here: If you go back to the command-line suggestions linked above, I believe they also assume single LF newlines which probably explains why the seemed to work sometimes--in retrospect, on my test data with just LF newlines--but not on my big file--with CRLF newlines.)

So, in the spirit of giving back, I've put the script I wrote in a Gist on GitHub (eat_CRLF_in_quoted_strs.pl) and included the code below, for your convenience. For my application, I chose to simply eat the newlines embedded in quoted strings, rather than replacing them with spaces.

Enjoy!

Wednesday, October 7, 2015

Surly Pacer Top Tube Shim

My performance endurance road bike is a Cervelo R3 with carbon frame. I love it, but it can be a bit of a rough ride on uneven surfaces.

The Surly Pacer is a steel road bike designed for long, all-day rides. In my size (54cm), it has almost the same stack & reach geometry as the Cervelo. IOW, it's basically the same fit. So I bought a Pacer and added fenders, to have a the same ride for winter training and more casual long rides.

The Pacer has one design flaw for long rides though: the rear brake cable runs along the top of the top tube, most of it exposed, which prevents you from using a Bento Box-style top tube bag. On long rides, I like to small amounts relatively regularly, and a top tube bag makes that much easier than jersey pockets. I like the eoGear top tube bags.

So I designed a shim using Tinkercad and printed it on a 3D printer locally using 3D Hubs. The shim rests on the top tube and protects the exposed cable (rear) and cable housing (front). The bag will rest on the shim, instead of directly on the top tube. The design is available here, if you want to print one or modify it.


Updates: Here is the shim (unpainted) with the top tube bag.



And this is the shim painted and affixed with thin two-sided tape. Works like a charm.


A coat of flat black paint makes the shim disappear.

Friday, April 24, 2015

Trying Craft Client or Pro on Mac OS X Before Buying

Craft allows you to try Craft Client and Craft Pro before buying them, by running your site from the host name craft.dev or a subdomain such as site1.craft.dev (link). However, it's not immediately obvious what "running your site from craft.dev" means.

This post explains how to run your site from craft.dev on Mac OS X using MAMP and Sequel Pro.

1 — Install Craft on MAMP

  • The first step is to install Craft Personal. These instructions explain how to do that on Mac OS X using MAMP and Sequel Pro.

2 — Create a directory for the craft.dev site

  • Go to /Applications/MAMP/htdocs
  • Create a new folder trybeforebuy/public/ (or whatever you want to call it)  
  • Copy contents of the htdocs folder there, including the hidden .htaccess file
  • Edit $craftpath in index.php to be '../../../craft'

3 — Setup craft.dev as an Apache virtual host 

  • Follow these video instructions 
  • Use 'craft.dev' (no quotes) as the ServerName
  • Use "/Applications/MAMP/htdocs/trybeforebuy/public/" (with quotes) as the DocumentRoot and Directory 

4 — Change the Web port to 80 from 8888

  • Open MAMP
  • Switch to the Ports tab
  • Click on the “Set the Web & MySQL ports to 80 & 3306” button

5 — Access the new Dashboard 

  • Go to http://craft.dev/admin/login 

6 — Upgrade your Craft installation

  • Click on the "Show me" link in the "Upgrade Craft to take your site to the next level" box at the bottom of your Dashboard
  • Click on the "Test" link by Craft Client or Client Pro, whichever you want to explore
That's it!

This post describes how to switch between the different versions--Craft Personal, Craft Client, and Craft Pro--if you want to do that.

Saturday, April 4, 2015

Installing Craft CMS on Mac OS X Using MAMP & Sequel Pro

I just wanted to set up Craft CMS on my Macbook to experiment with. It took me forever. This explains what worked (and what was missing from other sets of instructions).

Craft's install documentation assumes but doesn't explain several prerequisites (like setting up my MySQL) and The Absolute Beginners Guide to Setting Up Craft on Mac (sounded perfect) was directionally helpful but had some errors and, at the end of the day, simply didn't work for me. For the longest time, I could not get Craft to connect to MySQL, even though it was installed correctly and running.

These are the perquisite steps, assuming you aren't already running Apache and MySQL:
  • Step P1: Download and install MAMP -- This sets up a Macintosh, Apache, MySQL, and PHP solution stack and gives you a simple management interface to start and stop the servers.
  • Step P2: Download and install Sequel Pro -- This gives you a simple tool to manage your MySQL databases.
  • Step P3: Open MAMP (not MAMP Pro), and click "Start Servers" -- This starts the Apache Web server and the MySQL database engine.
  • Step P4: In MAMP, click "Open WebStart page," if MAMP didn't already open one in your browser automatically.
  • Step P5: Open Sequel Pro, and use the parameters on MAMP's WebStart page to create a connection -- I could not create a Standard connection using '127.0.0.1'. I had to create a Socket connection, using 'localhost'. This is the scenario that the Craft installation instructions don't anticipate.
  • Step P6: Click the control in the upper lefthand of Sequel Pro, and choose "Add Database ..." to create a database, e.g.. 'crafted'. Choose UTF-8 for “Database Encoding."
These steps track those in Craft's installation instructions but with some additional explanation. Read them together:
  • Step C1: Upload the files -- You're uploading the files, because you're pushing them to your Apache Web server on your Macbook.
    • Download Craft -- It will end up in your Downloads folder. There are two folders craft/ and public/ in the downloaded 'Craft-2' folder.
    • Copy the craft/ folder to /Applications/MAMP, i.e., above the webfoot (htdocs/). 
    • Copy the files htaccess, index.php, and robots.txt from public/ to the htdocs/ folder.
    • Rename htaccess to .htaccess -- Open Terminal and run the following commands:
      • "cd /Applications/MAMP/htdocs" -- go to where htaccess lives 
      • "defaults write com.apple.finder AppleShowAllFiles -bool true" -- show hidden files
      • [Relaunch the Finder] -- hold down control-Alt while you click on Finder in the dock and then choose Relaunch
      • "mv htaccess .htaccess" -- rename htaccess
      • "defaults write com.apple.finder AppleShowAllFiles -bool false" -- hide hidden files
      • [Relaunch the Finder again]
  • Step C2: Set the permissions
    • Open Terminal and run the following commands:
      • "cd /Applications/MAMP/craft" -- go to the craft/ directory
      • "chmod -R 744 app"
      • "chmod -R 744 config"
      • "chmod -R 744 storage"
  • Step C3: Create your database 
    • Nothing to do here -- You already did this in Step P6 above.
  • Step C4: Tell Craft how to connect to your database
    • Copy the default settings in "craft/app/etc/config/defaults/db.php" into "craft/config/db.php"
    • Update the parameters
    • Set 'unixSocket' to the Socket value on your MAMP WebStart page (e.g., "/Applications/MAMP/tmp/mysql/mysql.sock"). 
  • Step C5: Run the installer! 
    • The URL is "localhost:8888/admin"
    • Follow the installation screens
I'm guessing that if you are setting up Craft in a production environment on a production Mac you might want to do somethings a little different, but this will get you a sandbox to play in. Enjoy!



Monday, November 3, 2014

Seattle Rides


After an almost 20-year hiatus, I've started biking again.  These are the major rides I took this past summer:

  • Seattle Loop (link) — 25.5 miles total from home
  • Gas Works Parks to Seward Park and back (link) — 25.5 miles from Gas Works plus 8 miles roundtrip to/from Gas Works (via Roy, Mercer, and the path along Westlake)
  • Gas Works Park - Mercer Island Loop (link) — 31.5 miles from Gas Works; the Mercer Island loop is really nice, especially on a weekday when few others are riding (very popular on nice weather weekends)
  • West Seattle - Leschi - Gas Works Loop (link) — 37 miles total; you can connect into the loop right at the bottom of 3rd AVE W on the other side of the pedestrian bridge
  • North Lake Loop (link) — 41 miles from Gas Works Park; you ride clockwise around the north half of the lake, coming back over I-90
  • South Lake Loop — 38 miles from Gas Works; I don’t have a map for this and haven’t ridden it recently, but you’d basically ride over to Gas Works Park, start riding counterclockwise around the lake to I-90, and then you’d ride around the southern half of the lake (crossing it over I-90) either clockwise or counterclockwise, and then back to Gas Works
  • On either of the previous two rides, you can add another ~8 miles by looping around Mercer Island when you cross I-90
  • Lake Washington Loop (link) — 53 miles from Gas Works
  • Chilly Hilly (link) — 32.5 miles roundtrip from/to the Bainbridge ferry terminal
I had some fun rides with friends, but the twin highlights of this season were the full Lake Washington Loop and the Chilly Hilly route with my daughter, Sara.

Next year I hope to do some longer rides out through Snohomish, Monroe, Woodinville, Redmond, Issaquah, etc., as well as some hillier rides on Vashon Island.

UPDATE: I posted an annotated map and cue sheet along with a route (.gpx) for some rides around Vashon Island (link).

Friday, September 19, 2014

The nyrb game


I love great books, reading great, physical books.

And I love browsing books in bricks-and-mortar stores. When you know exactly what you want, buying books online can make sense, but nothing beats discovering books and authors by physically browsing racks of books.

But browsing in a decent bookstore can be overwhelming, because there are so many titles and the only way to meaningful filter them is using very broad categories (fiction vs. nonfiction and categories within nonfiction, e.g., health, business, art, music, computers, etc.).

Until recently the vast majority of my reading had been nonfiction, nerd stuff. Earlier this year though a good friend turned me on to the NYRB Classics series. While it includes both fiction and nonfiction, it is heavily weighted to the former (browse titles).

The books in the series are both extremely well written and underappreciated, out of print before becoming part of the series. The NYRB curates the titles and then reprints them. Hence, the NYRB Classics imprimatur is a wonderful way to discover great books and authors--the books are, in a word, superb.

I read 10 while taking this summer off:
  • Kingsley Amis - Lucky Jim
  • Elaine Dundy - The Dud Avocado
  • J.G. Farrell - The Siege of Krishnapur
  • J.G. Farrell - The Singapore Grip
  • J.G. Farrell - Troubles
  • Gabriel García Márquez - Clandestine in Chile
  • Ernst Jünger - The Glass Bees
  • Raymond Kennedy - Ride a Cock Horse
  • Elizabeth Taylor - Angel
  • Christina Stead - Letty Fox: Her Luck
They're some of the best books I've ever read, and I'd never even heard of most of the authors.

The game is finding the NYRB Classics titles in physical bookstores. Each title has this distinctive logo on its spine:
This makes scanning bookshelves to find these diamonds in the rough both easy and fun.

My favorite variation of this game is at used bookstores where you can frequently find used-but-new (remaindered) NYRB Classics titles very inexpensively (~$7). I scan the entire fiction section and buy all the NYRB Classics titles that I don't already have. I've done this a few times, and walked out with 8, 9, 10 books and spent well under $100!

I find it sad that such wonderful books are remaindered in the first place--not more in demand--but finding these used-but-new titles makes for a very enjoyable game nonetheless. 

And, who knows? Maybe if lots of people start playing the used bookstore NYRB game and get hooked on the series, they'll start looking for and buying new NYRB Classic titles and fewer will get remaindered.