Sunday, 21 August 2016

Building Full-Text Search with PostgreSQL and Phoenix

I’ve been building an Elixir/Phoenix job board over the last two weekends – I’m a big fan of the ecosystem and I wanted to prototype a little project that benefited the community at large. One useful aspect of a job board is search: being able to query for “Los Angeles” and get back jobs that are actually in Los Angeles, instead of scrolling through a list checking each location.

Phoenix uses PostgreSQL by default, which is great for building a search solution. While Postgres has a full documentation section about full-text search, I wanted to quickly give an overview of how I specifically added full-text search to my Phoenix application.

(I want to note here that in my instance, I’m searching across a single model, Job – if you need to search across multiple models, some of the code below will need additional configuration.)

We need to determine what fields are going to be searchable. For a Job, those fields are “title”, “company”, and “location”. We’ll create a Phoenix migration to index those fields, using Postgres’ GIN indexes – a specific type of index designed to speed up full-text search. Read more in the Postgres documentation, if you’re interested.
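
Roughly, that migration can look like the sketch below – the MyApp module name is a placeholder, and it assumes a jobs table with title, company, and location columns:

```elixir
defmodule MyApp.Repo.Migrations.AddSearchIndexesToJobs do
  use Ecto.Migration

  def up do
    # Expression indexes over to_tsvector output, so full-text queries can use GIN
    execute "CREATE INDEX jobs_title_search_idx ON jobs USING GIN (to_tsvector('english', title))"
    execute "CREATE INDEX jobs_company_search_idx ON jobs USING GIN (to_tsvector('english', company))"
    execute "CREATE INDEX jobs_location_search_idx ON jobs USING GIN (to_tsvector('english', location))"
  end

  def down do
    execute "DROP INDEX jobs_title_search_idx"
    execute "DROP INDEX jobs_company_search_idx"
    execute "DROP INDEX jobs_location_search_idx"
  end
end
```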

With the fields indexed, we’ll create a Postgres “view”, which gives us a single access point for searching all the fields on our model:
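
Something along these lines does the trick – again, MyApp is a placeholder, and the important part is that every searchable field becomes its own row with a searchable_id, searchable_type, and term:

```elixir
defmodule MyApp.Repo.Migrations.CreateSearchesView do
  use Ecto.Migration

  def up do
    execute """
    CREATE VIEW searches AS
      SELECT jobs.id AS searchable_id, 'Job' AS searchable_type, jobs.title AS term FROM jobs
      UNION
      SELECT jobs.id AS searchable_id, 'Job' AS searchable_type, jobs.company AS term FROM jobs
      UNION
      SELECT jobs.id AS searchable_id, 'Job' AS searchable_type, jobs.location AS term FROM jobs
    """
  end

  def down do
    execute "DROP VIEW searches"
  end
end
```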

(Note here that if you are using multiple models, you should also UNION here with a SELECT from each model… the pattern for each field should be pretty straightforward)

With this view created, we can test the query in psql:
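
For example (the rows shown are made up for illustration):

```sql
SELECT * FROM searches LIMIT 3;

--  searchable_id | searchable_type |    term
-- ---------------+-----------------+-------------
--              1 | Job             | Engineer
--              1 | Job             | Acme Co
--              1 | Job             | Los Angeles
```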

As we can see, the searches view exposes each field on a model as an individual record to be searched. Each of these records carries a searchable_id and a searchable_type, which we can use to look up the model itself.

Let’s add a search query. To do this, we’ll make use of the to_ts* (text search) utilities Postgres provides, for transforming an input string and matching it against the text that we’re searching. Note that at any point in the code, you can test out the Postgres to_tsvector/to_tsquery functions in the psql interface – we’ll breeze through them a bit as the Postgres documentation covers them pretty well.
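
A query against the view looks roughly like this – the 'english' text search configuration is an assumption, so use whichever configuration fits your data:

```sql
SELECT searchable_id, searchable_type
FROM searches
WHERE to_tsvector('english', term) @@ to_tsquery('english', 'acme');
```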

In this instance, we can see that the query acme has matched the Job record with the term “Acme Co”. One thing I realized here is that this view doesn’t include the column that was matched – in this case, the record matched on the company field, but we have no way of knowing that from the results. If you need that information, I’d encourage you to revise the “searches” view we implemented above.

One more thing – if you test the above SQL with the query “acme co”, you’ll notice it fails. This is because Postgres text search queries have interesting behavior around spaces – instead of “acme co”, we need to provide either “acme&co” (match a term with “acme” AND “co”) or “acme|co” (match a term with “acme” OR “co”). You can determine what kind of search you’d like – in my application’s case, I decided to use |, so I could do searches like “acme|los|angeles|engineer” and match multiple terms at once.
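
For example, in psql:

```sql
-- Raises "syntax error in tsquery":
SELECT to_tsquery('english', 'acme co');

-- AND: both terms must be present
SELECT to_tsvector('english', 'Acme Co') @@ to_tsquery('english', 'acme & co');

-- OR: either term is enough
SELECT to_tsvector('english', 'Acme Co') @@ to_tsquery('english', 'acme | co');
```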

At this point, we can begin adding code in our Phoenix project to allow searching directly in the web application.

We need to do two things in our application: first, add code that can execute raw SQL and load the results into models; then, using that code, run something like the SQL above to get back a set of models for a given query.

I elected to use a solution like the one in this StackOverflow post, in lib/<yourapp>/repo.ex:
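
A sketch of that pattern (the MyApp name and :my_app OTP app are placeholders):

```elixir
defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :my_app

  # Runs raw SQL and loads each returned row into the given model struct.
  def execute_and_load(sql, params, model) do
    result = Ecto.Adapters.SQL.query!(__MODULE__, sql, params)

    Enum.map(result.rows, fn row ->
      fields =
        result.columns
        |> Enum.map(&String.to_atom/1)
        |> Enum.zip(row)

      struct(model, fields)
    end)
  end
end
```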

With that implemented, we can add a search/1 function, which accepts a query and returns a set of Jobs matching that term:
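
Here’s a condensed sketch of the Job model with that function – the schema is trimmed down, and the table and field names are the ones assumed above:

```elixir
defmodule MyApp.Job do
  use Ecto.Schema

  schema "jobs" do
    field :title, :string
    field :company, :string
    field :location, :string

    timestamps()
  end

  @doc "Returns Jobs whose title, company, or location match the given query."
  def search(query) do
    # "acme co" becomes "acme|co" so multiple words match as an OR query
    term =
      query
      |> String.split(~r/\s+/, trim: true)
      |> Enum.join("|")

    sql = """
    SELECT j.*
    FROM jobs AS j
    WHERE j.id IN (
      SELECT searchable_id
      FROM searches
      WHERE searchable_type = 'Job'
        AND to_tsvector('english', term) @@ to_tsquery('english', $1)
    )
    """

    MyApp.Repo.execute_and_load(sql, [term], __MODULE__)
  end
end
```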

The above code is pretty straightforward – first, we take the input term and transform it to use | instead of spaces (remember “acme|co” versus “acme&co”). We then use SQL pretty similar to what we ran in the console – we search the “searches” view for the term, select the searchable_id of each returned result, and then use those ids to look up the actual Job rows. The execute_and_load function accepts a SQL statement, any bound variables, and finally a model to load the data into. This is an Ecto thing that I can best explain as building a strongly-formatted map of your model’s fields (like %Job{title: nil, company: nil, location: nil}) and then filling out that map with the SQL output.

With that, we can boot up an iex console and make sure that we get the data we expect:
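
Something like this (the output is illustrative):

```elixir
iex> MyApp.Job.search("acme los angeles") |> Enum.map(&{&1.company, &1.location})
[{"Acme Co", "Los Angeles"}]
```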

What a neat little way to build powerful text search! I’ll leave the remainder of setting this up, UI-wise, as an exercise for the reader. It could be as simple (as it is in my case) as a controller function that accepts query as a param and passes it to this search function.

This guide was built with Elixir 1.2 and Phoenix 1.2.1 – if you’re on a previous (or later) version of Elixir, Phoenix, or Ecto, this may not work as expected – sorry about that.

Tuesday, 16 August 2016

Integrating Mixpanel into a React and Redux application

I did some work recently on integrating Mixpanel into a project I’ve been building. Integrating Mixpanel was a lot easier than I thought, but there were a couple of complicated bits in getting it to work as expected with a React / Redux application. Here are the steps:

Installation

Import the mixpanel library and initialize it somewhere near the beginning of your app’s component lifecycle. I did this in my AppContainer (using the usual React component lifecycle methods), which renders out the router and components:
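
A sketch of that, assuming the mixpanel-browser package (the token and component structure are placeholders):

```javascript
// AppContainer.js
import React, { Component } from 'react';
import mixpanel from 'mixpanel-browser';

class AppContainer extends Component {
  componentWillMount() {
    // Initialize once, as early as possible in the component tree
    mixpanel.init('YOUR_MIXPANEL_TOKEN');
  }

  render() {
    // Renders out the router and the rest of the app
    return this.props.children;
  }
}

export default AppContainer;
```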

Identification

Identify your user and pass any additional information (name and email, for instance). This can be a tricky one depending on how your application’s redux flow is structured–it would make the most sense to do it as part of an action, but depending on your app’s concerns (persistence being a primary issue for my app), it might need to also happen in the reducer. Here’s how it’d look in an action:
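
A sketch of a login action creator – the action shape and user fields are placeholders for whatever your app uses:

```javascript
// actions/session.js
import mixpanel from 'mixpanel-browser';

export function loginSuccess(user) {
  // Tie all subsequent events to this user
  mixpanel.identify(user.id);
  mixpanel.people.set({
    $name: user.name,
    $email: user.email,
  });

  return {
    type: 'LOGIN_SUCCESS',
    user,
  };
}
```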

In my application, the store is persisted to local storage using redux-persist, so in the reducer, separately handling an already-logged-in user when the store rehydrates looks like this:
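
This sketch assumes redux-persist v4’s REHYDRATE constant and a reducer keyed under session:

```javascript
// reducers/session.js
import { REHYDRATE } from 'redux-persist/constants';
import mixpanel from 'mixpanel-browser';

export default function session(state = {}, action) {
  switch (action.type) {
    case REHYDRATE: {
      const incoming = action.payload.session;
      if (incoming && incoming.user) {
        // Re-identify a user who was already logged in before the page reload
        mixpanel.identify(incoming.user.id);
      }
      return { ...state, ...incoming };
    }
    default:
      return state;
  }
}
```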

Tracking events

With an identified user, you can begin tracking events in your actions:
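
For example, in an action creator (the event name and properties are just examples):

```javascript
import mixpanel from 'mixpanel-browser';

export function createProject(project) {
  mixpanel.track('Created Project', { name: project.name });

  return {
    type: 'CREATE_PROJECT',
    project,
  };
}
```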

If you need to act on the tracking event finishing, you can wrap the mixpanel call in a Promise and chain it:
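
Something like this, using the callback mixpanel.track accepts once the event is sent (the logout thunk assumes redux-thunk):

```javascript
import mixpanel from 'mixpanel-browser';

function trackLogout() {
  return new Promise((resolve) => {
    // The callback fires when the tracking request completes
    mixpanel.track('Logged Out', {}, resolve);
  });
}

// e.g. make sure the event goes out before clearing local state
export function logout() {
  return (dispatch) =>
    trackLogout().then(() => dispatch({ type: 'LOGOUT' }));
}
```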

Overall, I’m pretty happy with how the Mixpanel + React integration turned out. The user identification feels gross with redux-persist, but doing it as part of the login and sign up actions feels more intuitive than firing it as part of a button click callback.

Thursday, 11 August 2016

A nice touch in Mr. Robot

Are you watching Mr. Robot? I talked about it in the last post – it’s probably my favorite show on TV right now. I’m a bit behind on season 2, but I’ve been rewatching the first parts of season 1 over the last few days, and realized something pretty amazing…

[spoilers ahead]


Tuesday, 12 July 2016

Archive and delete

I began re-watching season one of Mr. Robot (Prime Video link) in preparation for the upcoming second season. One of my favorite scenes from the show’s premiere is Elliot’s fsociety speech:

Some of it is a bit cynical, but re-watching it made me appreciate (again) the core message: a lot of what we do online is vapid. I’ve begun re-evaluating some of the websites and apps that I use on a daily basis – it was already on my to-do list, anyway – and I identified Facebook as the worst of the bunch. With Twitter, I stay in touch with people I don’t know in real life: we don’t text, we don’t meet in person. Facebook is different: it’s people I know in real life, but we never interact on Facebook itself. With the exception of meaningless likes on posts, I never interact there at all.

So I archived my account data and de-activated. The account isn’t gone forever (I’m still debating this one), but it is out of my daily cycle of meaningless websites to stare at every day. On to better things.

Saturday, 25 June 2016

Securing WordPress with YubiKey

I recently picked up a YubiKey Nano, and I’ve been really happy with it so far:

One useful integration has been with WordPress (this site). There are a few plugins available, but I thought I’d share that I had the best success with the “Two-factor” plugin. It’s intended to eventually be merged into WP core, so it’s a safe bet for a reliable, long-term plugin (I think).

Setup is easy: install the plugin (just search the plugin directory), and configure two-factor settings in the “Users” section of your WordPress admin. I configured two options – YubiKey (using the U2F protocol) and Google Authenticator (standard mobile six-digit 2FA).

Support for YubiKeys and USB security dongles in general is growing – the impetus for purchasing my YubiKey was support in GitLab, though you can check a fairly up-to-date list at dongleauth.info.

Saturday, 11 June 2016

New music

I’ve been writing a lot of music lately, especially focused on actually getting songs out the door. I’m happy to say the process has been really positive – the newest tune, “Paddle”, is out on Soundcloud, and is probably my favorite thing I’ve worked on recently. Check it out!

Thursday, 2 June 2016

Getting Started with Emmet now available on Pluralsight

My newest course with Pluralsight, “Getting Started with Emmet”, is now available. A quick summary:

Emmet, a toolkit for web developers, integrates with your text editor to rapidly increase your speed in laying out HTML and CSS. This course is an introduction to Emmet and how developers can use it, including making custom shortcuts and macros.

It’s a shorter course, but it can have a big impact on how you write HTML and CSS day-to-day. Check it out on Pluralsight:

https://app.pluralsight.com/library/courses/emmet-getting-started/table-of-contents

Wednesday, 27 April 2016

The best is the worst

I was baptized into the cult of programmer “tooling” around day one. It was simple–which text editor should I use? The answer is… complicated.

There are two high-level categories of programming tools: software and hardware. Software tools like Vim, Emacs, and Git are often free (and open source). People have strong opinions about software because they can often just install it and try it.

I know I like Git because I’ve used it in projects over the span of many years. It’s been hugely helpful. I would have had less opportunity to use and recommend Git if it had a $1,000 a year license.

Here’s the flipside of that scenario: because a lot of these tools are free, we also have the luxury of ignoring opinions, and just trying things ourselves. If someone hates Git, it might just be that it doesn’t work for them, or their method of working. It might still work for you.

Hardware is, pardon the pun, harder. It exists in the world as an actual thing. It has a cost attached to it. This makes it difficult to test. We have proxies–reviewers, who have actually used the product. I might not be able to go out and buy a $2,500 laptop for another six months, so I have to find a technology site that has reviewed it (objectively, we would hope) and determine whether it’s worth my money.

This is why hardware is a hard value proposition. Here’s a statement that I know is true for me – it may not be for you, and that’s okay. Ready? Expensive seems better. 

(I know, it sounds terrible, but bear with me! I’m having a sea change here.)

Would I pick a $2,500 MacBook Pro, or a $500 Chromebook? Hello, Apple.

Would I pick a $100,000 Tesla, or a $20,000 Honda? Hello, Elon. 

Would I pick a $400 ergonomic, mechanical keyboard, or a $40 Logitech keyboard made of mostly plastic? Well…

I’m trying to be better about this, and that last one is a real-life example. Here’s my recently revised version of that statement, the mantra that I have to keep repeating to myself:

Disposable is better.

Huh?

If I can build products with the Chromebook, isn’t it better to have something I have almost no attachment to? If I’m sitting at a coffeeshop and someone runs by and grabs the laptop, will I feel worse knowing that I just lost $2,500, or just $500?

I’m trying this with any new purchase nowadays. My custom-built ErgoDox keyboard with its split-hand layout? Honestly? It works like 60% of the time, and I paid $400 for it. The Logitech keyboard I’m typing this on was $40. I could break this keyboard ten times and it wouldn’t hurt as badly as breaking the ErgoDox once.

Attachment is hard, and a lot of the time, it’s a distraction. This year I’m trying really hard to stop focusing on useless things like what phone or laptop I have, and start getting down to building stuff. Join me?

Thursday, 21 April 2016

I made it!

I mentioned a couple months ago that my girlfriend and I were going to move to Los Angeles! We finally made it here. After a beautiful drive down I-5 over the course of a weekend, this marks the end of my first work week in LA. I’m excited to be in the city and start exploring a bit. It’s going to be a really incredible year–I’ve been heads-down on a few things over the last couple months, and I’m excited to come up for air soon with a couple of them.

Monday, 11 April 2016

We don’t have time for assholes

José Valim, one of the leads at Plataformatec and creator of the Elixir programming language, has left Twitter:

Around the same time, this Gist (URL, not embed, because the comments are the relevant part to this post) caught my attention, which seems to be directly related to José’s departure.

The comments are a frustrating read. It’s not worth giving trolls the attention they want, so I’ll speak without mentioning anyone in particular…

We don’t have time for assholes. Elixir is a cool language, and the Phoenix web framework does a lot of things really well. Like any software project, it’s not perfect, and neither are the maintainers. If something doesn’t work like you want it to in open source, show us the code.

As soon as you begin ad hominem critiques on the person, instead of the project, you lose credibility. So when you write things like this:

Yes, there’s a maturity problem in the Elixir community. But it’s not just a matter of the inexperience of its technology. it’s also a matter of character of its leaders. And in the time that it takes a human being to develop indefatigable and steadfast character, a software platform with great potential can easily be laid waste with the personality cult sideshow that completely distract from quality and sensible design decisions.

It comes across as petty – why aren’t the people who run this project perfect, virtuous humans? Well, because none of us are. What a silly requirement for using a software project.

We really don’t have time for assholes. Find people who, even if they are responsible for challenging conversations, produce a net positive in your day-to-day.