11 Nov 2016, 12:58

goiardi version 0.11.0 - Steel Bacon

After years of extensive dwarven research, scientists have concluded that no, pig iron is not related to pigs in any way, and is therefore impossible to craft into ☼steel bacon☼.

It’s still not orgs and whatnot, but after a long simmering process, goiardi 0.11.0 - Steel Bacon is now available. The fixes in this release were important enough to ship before 1.0.0 is ready.

From the CHANGELOG:

* Ability to upload cookbooks to S3.
* Add a script to upload existing local files to S3, to ease migration.
* Change how items are indexed with the postgres indexer, to reduce the number of rows in the search_items table substantially (at the cost of possible differences in search results in a few weird corner cases).
* Search parser no longer chokes on Unicode. Unfortunately, Postgres' ltree module still does not accept all Unicode alphanumeric characters as valid.
* Use vendoring.
* Rejigger the package building process a bit, changing how the different packages are built and how version numbers are determined.
* Fix a long-standing annoyance where the log file would get truncated when goiardi started or restarted.
* Allow passing environment variables to goiardi through the config file.
* Fix in-memory indexer to work with go 1.7.
* Add packages for CentOS 6 and 7. Also use a gox fork that pulls in someone's PR with better ARM support, until that gets merged upstream.
* Change the postgres columns using the 'json' data type to use 'jsonb' instead. This is generally better, but does mean that goiardi now requires PostgreSQL 9.4 or later (a sketch of what this change looks like follows this list).
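For the curious, the json to jsonb switch boils down to something like this at the database level. This is purely illustrative; the actual change is applied by the sqitch patches, and the table and column names here are made up:

    # hypothetical table and column names -- the real migration is a sqitch patch
    psql -d goiardi -c "ALTER TABLE example_objects
        ALTER COLUMN serialized_object TYPE jsonb
        USING serialized_object::jsonb;"

Since jsonb stores a parsed binary representation rather than the raw json text, it can be indexed and queried more efficiently; it first appeared in PostgreSQL 9.4, hence the new minimum version.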

Packages are available at https://packagecloud.io/ct/goiardi, and individual binaries for a variety of architectures can be found on the release page.

The most important parts of this release are being able to upload files to AWS S3 (or compatible services), which you can read more about in the s3 docs, and the improvements with the Postgres search. These search changes reduce the number of rows in the search table by roughly half, at the expense of slightly more complex SQL queries behind the scenes. Upgrades to 0.11.0 from earlier versions of goiardi using the Postgres search will need to reindex the search tables by running knife index rebuild -y.
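For a rough idea of what turning on S3 uploads looks like, here’s a sketch of a config file excerpt. The option names below are placeholders rather than the real ones; the s3 docs have the authoritative list:

    # illustrative only -- consult the s3 docs for the actual option names
    use-s3-upload = true
    aws-region = "us-east-1"
    s3-bucket = "my-goiardi-cookbooks"

With something along those lines in place, cookbook files go to S3 (or a compatible service) instead of goiardi’s local filestore.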

In any case, anyone running an earlier version of goiardi against a version of Postgres older than 9.4 will need to upgrade postgres before updating goiardi and applying the sqitch patches.
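A minimal sketch of that upgrade order, assuming a database named goiardi and a schema managed with sqitch (the exact Postgres upgrade mechanics depend on your packaging):

    # 1. check the running server version; 9.4 or later is required
    psql -d goiardi -c "SHOW server_version;"
    # 2. upgrade PostgreSQL itself if needed (pg_upgrade, dump/restore,
    #    or your distribution's tooling)
    # 3. apply the schema updates, then start the new goiardi
    sqitch deploy db:pg:goiardi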

This release also brings goiardi binaries for Linux on s390x (a.k.a. z/Architecture), because go 1.7 can now compile binaries for s390x. Said binaries appear to work, so the next minor version of goiardi will include Debian and RHEL based packages for that architecture. This way, goiardi will obviously be Ready For The Enterprise™. Binaries for VMS will have to wait until such time as goiardi builds again with gccgo, sadly.

24 Jul 2015, 23:11

goiardi version 0.10.0 - Taihun

“Taihun” was the Gothic word for “ten”. For whatever reason all of the surviving Germanic languages have lost that ‘h’ sound in the middle of the word “ten”, but Gothic preserved it. You can also see some of the usual Germanic sound changes in that word, where a ’d’ sound became a ’t’, and at least some ‘k’ sounds became an ‘h’ sound. Compare Latin decem (bear in mind that the ‘c’ there had a ‘k’ sound originally). You see that sound change as well with “hundred” (compare Latin centum (same thing with the ‘c’)). It also shows a lack of satemization - compare the Russian де́сять (desyat’).

I am pleased to present goiardi version 0.10.0 - Taihun. This release brings the long discussed PostgreSQL based search, along with a search bug fix and some behind-the-scenes changes that won’t impact anyone directly.

From the CHANGELOG:

* Search architecture changed so different search backends can be used (thanks
  oker1 for your work on that).
* Postgres search is here at last! If you're using Postgres, you can use it
  to power your searches instead of the ersatz solr search.
* Add a mutex for the original goiardi search - multiple simple queries
  executing simultaneously are not a problem, but multiple complex queries can 
  eat up all the RAM on the machine and cause goiardi to crash. This mitigates
  that situation.
* Be a little more forgiving with reporting protocol versions - allow
  specifying the protocol version as a query param instead of only as a
  header. This is to make showing reports with the webui a little easier.
* Bump the Chef Server version we claim to be from 11.1.6 to 11.1.7.

The Postgres search has come a long way since it went into preview. Performance with 10,000 fauxhai generated nodes is a lot more reasonable now, and it appears to be stable and behaving well. If it isn’t, of course, I’d like to hear about it. There is more detailed documentation about the new postgres search available, if you’re interested. The postgres search preview post also has useful information that’s still relevant, if you haven’t seen it yet.

For the future, once 1.0.0 is finished, I also want to make a standalone version of this search à la universe that could work with Chef Server. This would require some changes in erchef to populate the tables, though.

As mentioned in the release notes:

No matter what, whether you’re using the new postgres search or not, if you’re upgrading to this release you’ll need to rebuild your search index. If you’re using the old search, delete your index file first. Then, whether you’re using the old trie index or the new postgres based search, run knife index rebuild -y.
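As a concrete sketch of those steps (the index file path below is hypothetical; it’s whatever index-file is set to in your goiardi config):

    # old trie index only: remove the stale index file while goiardi is stopped
    rm /var/lib/goiardi/goiardi-index.bin
    # then, with goiardi back up, rebuild the index (old search or postgres alike)
    knife index rebuild -y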

At last, the little issues that kept presenting themselves should all be dealt with. Development on 1.0.0, bringing orgs+RBAC to goiardi, can restart and hopefully be finished sooner rather than later.

22 Jun 2015, 13:02

Postgres search in preview

It’s been stewing for way too long, moving in fits and starts over the last several months, as I’ve been busy with work stuff, real life stuff, and switching jobs stuff, but I’m pleased to announce that a new search backend is available for goiardi. Instead of just using the ersatz Solr trie-based search that goiardi’s had for a while, you can use a Postgres based search backend for Chef searches.

Since this is a pretty big thing, I’m giving it some time to stew in testing before merging it into master and making a release. Right now, it’s in the goiardi 0.10.0-dev branch.

Motivation

Goiardi’s search is fine for smaller workloads, but once a few hundred or so nodes were in place it began to chug, searches got very slow, and memory usage climbed too high. Solr, on the other hand, while a fine product, never seemed like quite the right tool for the job here. I thought that having search be done in Postgres was the way forward.

Testing it out

The postgres search is not yet in master (obviously), so you’ll need to install goiardi from source. Follow the installation notes in the goiardi documentation. After you run go get, you’ll need to cd into the goiardi directory (so something like cd $HOME/go/src/github.com/ctdk/goiardi), check out the 0.10.0-dev branch with git checkout 0.10.0-dev, and then run go install github.com/ctdk/goiardi.
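Putting those steps together, and assuming a stock GOPATH of $HOME/go, the whole dance looks like this:

    go get github.com/ctdk/goiardi
    cd $HOME/go/src/github.com/ctdk/goiardi
    git checkout 0.10.0-dev
    go install github.com/ctdk/goiardi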

Once goiardi is installed, you’ll need to create the configuration file. Use the sample file in etc/ in the goiardi repository. The important settings are the postgres options (obviously) and the “Postgres and advanced search” options in that file. You’ll need to set local-filestore-dir, and comment out or remove the data-file and index-file options.
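As a sketch, the relevant portion of the config might end up looking something like the excerpt below. The local-filestore-dir, data-file, and index-file options are the ones named above; the other option names are my best recollection, so double-check everything against the sample file in etc/:

    # local file store is required for this setup
    local-filestore-dir = "/var/lib/goiardi/lfs"
    # data-file and index-file should be commented out or removed
    # data-file = "/var/lib/goiardi/goiardi-data.bin"
    # index-file = "/var/lib/goiardi/goiardi-index.bin"
    use-postgresql = true
    pg-search = true
    # ...plus the usual postgresql connection options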

To set up the database properly, install sqitch and checkout the pg-search branch of goiardi-schema (or use the not-finalized sqitch postgres bundle inside the goiardi source) as described in goiardi’s postgres documentation.
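In rough shell terms, that setup looks something like this, assuming the schema repository lives at github.com/ctdk/goiardi-schema and the database is named goiardi (the postgres documentation has the full walkthrough):

    git clone https://github.com/ctdk/goiardi-schema.git
    cd goiardi-schema
    git checkout pg-search
    sqitch deploy db:pg:goiardi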

Once it’s up and running and you can connect to the goiardi instance, it’s time to start playing around. You could slowly create a bunch of nodes, roles, and what have you, but an even easier way to go about it is to use this node builder ruby script. It requires the chef-api (version 0.5.0) and fauxhai gems.
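Getting the script’s prerequisites in place is just a matter of installing the gems:

    gem install chef-api -v 0.5.0
    gem install fauxhai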

Take that script, fill in your hostname and path to your admin client’s key (making sure you can read it, of course), customize how many nodes you want to create on line 31, and let ‘er rip. It will run for a while, but once it’s done you can start throwing searches at it to see how it performs.

It’s also totally possible to just use it normally as you’d use a chef server, of course.

Caveats

  • Obviously this requires Postgres. The in-mem/file and MySQL data stores have to use the old search.
  • This should be able to handle all normal use cases for Chef search, and almost all weird cases. It’s still possible to create a situation where you get weird results back, however. The big ones are that fuzzy and distance searches will not behave like you may have hoped, and that because ltree indices in Postgres only accept alphanumeric characters (plus ‘_’), with ‘.’ as a path separator, goiardi has to convert attributes and search fields containing forbidden characters to an acceptable alternative. This shouldn’t be a problem, but if you were to have attributes named both “/dev/xvda1” and “dev_xvda1” you might not get the search results you expect. My advice at this moment is “don’t do that”.
  • The goiardi postgres search has been tested with up to 10,000 nodes generated with fauxhai without problems, but it’s very new. It’s quite likely that there are situations where the traditional Chef search with Solr is the better choice. Right now you need to use erchef for the Solr search, but now that it’s possible to add arbitrary search backends to goiardi real solr support may come someday.
  • Ideally, though, the pg-search should be able to handle any search query you toss at it (excepting the issues above with key names, and fuzzy and distance searches). If you find a bug in the Solr query parser where it chokes on a totally legitimate query, please report it as a bug.

Future

This search needn’t be limited to just goiardi. Per the proof of concept seen in the standalone goiardi universe server, this goiardi postgres search could also be carved out and run as a standalone service. Unfortunately that won’t really work until it gets integrated back into the 1.0.0 branch and development there restarts, but it’s an exciting concept. For a multi-organization implementation, each organization will probably have its own search schema, rather than keeping them all together. It may or may not ever be a good fit for, say, Hosted Chef, but might work well in a self hosted installation.

Shoutouts

Major props are due for both @oker1, for providing the initial impetus to finally get this done and providing code to split the indexer and search apart from the backends, and @coderanger for giving me ideas and directing me towards using the ltree index instead of jsonb for searching.

01 Jul 2014, 21:36

Goiardi version 0.6.0 - Order of the Elephant

The Order of the Elephant is a Danish order awarded mostly to foreign heads of state and royals. Its elephant-themed insignia is appropriate for this latest release of goiardi, version 0.6.0.

Along with some bug fixes and smaller improvements, this release introduces Postgres as a supported database backend. The schema is not compatible with erchef’s schema, but is close to the goiardi MySQL schema (with some differences of course). This is the first project I’ve done with Postgres, so comments on how goiardi is using it are welcome.

Full CHANGELOG:

  • Postgres support.
  • Fix rebuilding indexes with an SQL backend.
  • Fix a bug where in MySQL mode events were being logged twice.
  • Fix an annoying chef-pedant error with data bags.
  • Event logging methods that are not allowed now return Method Not Allowed rather than Bad Request.
  • Switch the logger to a fork that excludes syslog when building on Windows, so that goiardi can be built and used on Windows.
  • Add basic syslog support.
  • Authentication protocol version 1.2 now supported.
  • Add a ‘status’ param to reporting, so the list of reports returned by ‘knife runs’ can be narrowed by the status of the chef run (started, success, and failure).
  • Fix an action at a distance problem with in-memory mode objects. If the old behavior is still desirable (it seems to be slightly faster than the new way), it can be turned back on with the --use-unsafe-mem-store flag. This change DEFINITELY breaks in-mem data file compatibility. If upgrading, export your data, upgrade goiardi, and reload your data.
  • Add several new searchable parameters for logged events.
  • Add organization_id to all MySQL tables that might need it someday. Orgs are not used at all, so only the default value of 1 currently makes it to the database.
  • Finally ran ‘go fmt’ on goiardi. It didn’t even mess up the long comment blocks, which was what I was afraid it would do. I also ran golint against goiardi and took its recommendations where they made sense, which was most areas, except for some involving generated parser code, comments on GobEncode/Decode, comments on a bunch of identical functions on an interface in search, and a couple of cases involving make and slices. All in all, though, the reformatting, linting, and light refactoring have done goiardi good.

To reiterate, this update breaks saved data file compatibility. You’ll need to export and re-import your data if you’re using the persistent in-memory data store.
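The export and import are done with goiardi’s own flags; as a hedged sketch (I believe the relevant flags are -x for export and -m for import, but confirm with goiardi --help on your build):

    # with the OLD binary, before upgrading: dump everything to a file
    goiardi -c /etc/goiardi/goiardi.conf -x goiardi-export.json
    # upgrade goiardi, then load the data back in with the NEW binary
    goiardi -c /etc/goiardi/goiardi.conf -m goiardi-export.json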

A selection of precompiled binaries are provided on the release page, including Windows and Solaris builds. There are also knife plugins for goiardi’s reporting and event logging capabilities at knife-goiardi-reporting and knife-goiardi-event-log. The reporting plugin was forked from the official Chef knife-reporting plugin to add support for the “status” parameter. Both plugins are also available on rubygems.
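Since both plugins are published on rubygems, installing them on a workstation is a one-liner:

    gem install knife-goiardi-reporting knife-goiardi-event-log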

Finally, after someone asked how goiardi would handle running ~500 nodes, I decided to take a look. I whipped up a little test script to create a thousand nodes and clients to see how goiardi (and, more importantly, the search) would handle it. I wasn’t particularly worried about the non-search portions of goiardi (1,000 rows in a table isn’t very much at all), but I hadn’t tested search extensively with large amounts of data. After creating those 1,000 nodes and clients, goiardi was using about 1.40 GB of RAM. An earlier test with 5,500 nodes (but no clients) used 6.45 GB of RAM. The numbers show that there’s definitely room for improvement with the search, in that the capacity for indexing nodes and data bags is limited by RAM, but performance was good even at 5,500 nodes. More rigorous testing is needed, however. Search is an area I intend to revisit shortly to improve these problems.

The next issues to work on with goiardi are: improving search to work even better with large amounts of data, and using raft (or something similar) to allow multiple goiardi instances to keep their search indexes consistent; reworking the permissions system; being able to upload files to s3 or similar storage providers; and the serf based pushy type thing.