What is a Maven Repository?

Contents

Overview

If you’ve just joined the software engineering workforce at a Java shop, or have recently become a Java developer, you may be asking yourself, “What is Maven? Why do I need it?”

Ask your co-workers and they might respond with something along the lines of, “A Maven Repository is where we store all of our artifacts.”

Artifacts? Are we some type of archeologists?

This article assumes you’re relatively new to the professional world of the Java ecosystem.

Its goal is to answer the following questions which will surely arise in the first couple of weeks at your new job:

  • What is an Artifact?
  • What is Maven?
  • What is a Maven Repository?
  • What about Private Maven Repositories?

Consult the Table of Contents above to jump ahead to the question that is most relevant to you, or read the entire article to gain the most knowledge. Either way, we hope this article will provide you with a helpful intro to the world of Artifacts and Maven.

Before we get started, we’d like to make clear other assumptions that hold for the remainder of this article:

  • We are discussing Java software development practices. All references to anything software related will refer to the Java world.
  • Maven, Artifacts, and other related topics discussed here are used by other programming languages that run on the Java Virtual Machine (JVM), such as Scala (Simple Build Tool - SBT), Groovy (Ant & Ivy), and Clojure (Leiningen & Boot). The concepts discussed here apply similarly to these technologies.

What is a Maven Artifact?

In Maven, an artifact is any type of file that is used in the software development process. The most common of these are Java libraries, also known as ‘JAR files’. Software distribution files, packages, Maven project (POM) files, documentation bundles, machine learning models, and any other type of file you can think of can all be artifacts in the Maven world.

Maven Artifacts

Artifacts are used in a Java program for many different purposes. When a JAR file artifact is used at compile time it is typically used to bring in Java library code so that code can be reused.

Some artifacts may be packaged with the software and not used until runtime. These can include artifacts which hold data of some sort: images, machine learning models, documentation, language packs, etc.

TLDR: Artifacts are files used by Java programs. They can also be Java programs themselves, in the case of self-executing archives or other types of executable Java files.

The most common type of artifact you’ll encounter is a dependency, i.e. a Java library. This brings us to your new friend (and our old friend), Maven.

What is Maven?

Maven is a software development tool which automates dependency management by defining a project object model (POM) that abstracts the structure of a software project. Using POM files to represent the concept of a project, Maven provides a single model of a project that can be used by many different tools in the Java Ecosystem (IDEs, build scripts, etc.).

Where is Maven used?

You can find usages of Maven in build scripts, deployment pipelines, continuous integration builds, and pretty much any place where JVM-based software is created and deployed.

It’s in use by teams large and small throughout the world to compile software into JAR files, build reports and documentation, and automate many other tasks.

Why should I use Maven?

Let’s take a quick minute and discuss what the Java development process looks like when you’re just getting started.

Let’s say you start a new project for your Facebook-disrupting new app. You open your editor and start writing your Java program. You come to a part in the code where you want to insert data into a database, so you have two choices:

  • Research the database protocol and write code that talks directly to the database, or
  • Find a library that has already implemented the database connectivity.

Which one will you choose? If you have any hope of shipping that new disruptive app, you had better use the library. Why? Because writing database code is non-trivial and will suck down most of your energy before you even get to your actual app’s code.

Software Engineering 101: “Don’t Reinvent the Wheel”

In other words: find a reliable library and use it rather than building your own.

One of the reasons Java is popular today is the wealth of existing open source libraries that are available for you to use.

Other than app-specific business logic, most utility code that you will need has been written and is waiting for you to use - you just have to find it!

Okay, so once we find a library that contains the code we need (how you do this could be a completely different article), how do we add it to our program?

Adding Dependencies to Classpath

In Java, we can add libraries to our programs by downloading and adding the JAR files to the Java Classpath.

If you’re using an Integrated Development Environment (IDE), the GUI will guide you.

If you’re a hard core glutton for punishment, using a basic text editor and the command line javac and java commands to build and run your program, you’ll need to add the -cp or -classpath argument to your invocations.

If you only have to add a single library to your classpath it might not be such a big deal, but what if the library you are using requires yet another library (and that one requires another, and so on and so on)?

You can quickly end up in what is referred to as Dependency Hell, where you have to download and specify dozens, if not hundreds, of different libraries.

Simplifying Classpath Management With Maven

It’s madness, and it sucks, and it’s something that we used to have to do all the time for all of our projects before we started to use Maven.

With Maven, you no longer directly manipulate the classpath or download JARs.

Using an XML configuration file, known as a Project Object Model (POM) or POM file, you specify the dependencies your project needs and then let Maven do the rest.
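To make this concrete, here is a minimal sketch of a POM file that declares a single dependency. The project coordinates (com.example, disruptive-app) are hypothetical, and the PostgreSQL JDBC driver and its version number are assumptions picked purely for illustration; substitute whichever library and version your project actually needs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates identifying your own project -->
  <groupId>com.example</groupId>
  <artifactId>disruptive-app</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- Assumed example: a database driver library -->
    <dependency>
      <groupId>org.postgresql</groupId>
      <artifactId>postgresql</artifactId>
      <version>42.7.3</version>
    </dependency>
  </dependencies>
</project>
```

The groupId, artifactId, and version together identify exactly which artifact Maven should fetch, which removes the guesswork about which version of a library your project depends on.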

When Maven runs, it will look at the list of declared dependencies and download all of them, including any libraries that your dependencies themselves require, also known as ‘transitive dependencies’.

Once your build has completed, you’ll be able to run your program (at the command line or through an IDE) and your classpath will include all the JARs that were automatically downloaded for you.

It’s relatively straightforward (if you can get past the XML verbosity), and it has helped many development teams manage their dependencies in a declarative, repeatable manner.

This is much better than the old days, when we’d check dependencies in to version control, yikes! Even worse, it was common to leave the version number off the JAR, so you never really had any idea which version of a library you actually required!

So, that’s a brief description of one of the most common use cases of Maven.

It can also build your project, bundle up your application, publish it, and do many different things all driven by various plugins that have been written over the years.

Just as with Java libraries, there is usually a Maven Plugin available for anything you want to do - you just have to find it!
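Plugins are configured in the POM as well. As a rough sketch, the fragment below configures the standard maven-compiler-plugin and would sit inside the project element of a POM. The plugin version and the Java release level are assumptions chosen for illustration.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <!-- Assumed version; use whichever release your team has standardized on -->
      <version>3.13.0</version>
      <configuration>
        <!-- Assumed target: compile for Java 17 -->
        <release>17</release>
      </configuration>
    </plugin>
  </plugins>
</build>
```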

If you were paying attention, you might have wondered where Maven downloads all of these dependencies from. Well, the answer is quite simple: Maven Repositories.

What is a Maven Repository?

A Maven Repository is a location, generally on a filesystem (either remote or local), where Maven artifacts are stored and managed. Once artifacts have been stored in a Maven repository, they are available for retrieval and inclusion in other Maven projects.

Maven Repositories

Just like artifacts, repositories can be called by many different names: Artifact Repositories, Package Repositories, Package Managers, Repository Managers, Binary Repositories, the list goes on and on!

Remote Maven Repositories are web servers which provide simple HTTP and HTTPS endpoints that accept PUT requests for publishing and GET requests for retrieving Maven Artifacts.

That might seem like it’s a little bit too technical, but we’re all software engineers here after all and HTTP is something we all are familiar with in 2018, we hope!
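On the publishing side, you tell Maven where to upload your artifacts with a distributionManagement section in the POM. The sketch below is only an illustration: the repository ids and the repo.example.com URLs are hypothetical placeholders for whatever your repository actually provides.

```xml
<distributionManagement>
  <!-- Hypothetical destination for release versions -->
  <repository>
    <id>example-releases</id>
    <url>https://repo.example.com/releases</url>
  </repository>
  <!-- Hypothetical destination for SNAPSHOT versions -->
  <snapshotRepository>
    <id>example-snapshots</id>
    <url>https://repo.example.com/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```

With something like this in place, running mvn deploy builds the project and uploads the resulting artifacts to the matching repository: snapshot versions go to the snapshotRepository and everything else goes to the repository.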

Where can I find Open Source Libraries and Artifacts?

Java is known for its wealth of open source libraries and most of these libraries are available through Maven Repositories. In particular, the largest store of open source libraries in the Java Ecosystem is the Maven Central Repository.

Maven is configured to check the Central Repository by default, so you won’t have to configure your POM files to retrieve artifacts from it - simply declare your open source dependencies and the Maven command line will take care of the rest!
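This default comes from the built-in ‘Super POM’ that every project implicitly inherits from. You never need to write it yourself, but in recent Maven 3 releases the inherited definition looks roughly like this sketch:

```xml
<repositories>
  <repository>
    <!-- The well-known id of the Maven Central Repository -->
    <id>central</id>
    <name>Central Repository</name>
    <url>https://repo.maven.apache.org/maven2</url>
    <layout>default</layout>
    <snapshots>
      <!-- Snapshot versions are not resolved from Central -->
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```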

You can read more about Maven Central and other public repositories in a separate article we’ve written called Public Maven Repositories: Maven Central and More.

What about Private Maven Repositories?

We’ve covered the Maven Central Repository, the place where Maven pulls its publicly available, open source dependencies from, but what about the dependencies that contain our company’s proprietary, private code?

This is where Private Maven Repositories come in.

Private Repositories are just like any other Repository except that they contain a company’s private artifacts.
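Because a private repository is just another repository, pointing a project at one is a matter of declaring it in the POM. In this sketch the id, name, and URL are all hypothetical; your repository provider or administrator will supply the real values.

```xml
<repositories>
  <repository>
    <!-- Hypothetical id; credentials in settings.xml are matched against it -->
    <id>example-private</id>
    <name>Example Company Private Repository</name>
    <!-- Hypothetical URL -->
    <url>https://repo.example.com/private</url>
  </repository>
</repositories>
```

Maven will consult this repository as well as the Central Repository when it resolves your dependencies.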

Typically, a Private Repository will implement access controls, or will be isolated on an internal network, in order to prevent people outside of the company from accessing private artifacts. Historically, most private repositories have been hosted inside a company’s data center or behind its firewall. However, with everything moving to the cloud, new cloud-based Maven repository managers have been developed.
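When a repository requires authentication, credentials are usually kept out of the POM and placed in your personal Maven settings file (typically ~/.m2/settings.xml). The sketch below assumes a repository declared in the POM with the hypothetical id example-private; the username and password are placeholders.

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <servers>
    <server>
      <!-- Must match the id of the repository declared in your POM -->
      <id>example-private</id>
      <!-- Placeholder credentials -->
      <username>your-username</username>
      <password>your-password</password>
    </server>
  </servers>
</settings>
```

Keeping credentials in the settings file rather than the POM means they are not checked in to version control along with the project.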

Private Repositories are not exclusively for private artifacts. They can also be used by companies that wish to publish certain artifacts to the public but wish to maintain control over the distribution of these artifacts.

How does CloudRepo fit in?

CloudRepo provides both Private and Public Maven Repositories built on a cloud-based, highly available architecture. CloudRepo provides access controls so that you can restrict access to specific users, but it also allows for publicly available repositories so that anyone can connect and download artifacts.

CloudRepo also provides the ability to search across your Repositories for any of your Artifacts. A robust user portal is also available for browsing the contents of your repositories.

Thank You, We Hope That Helped!

Our intent with this article was to give you a quick introduction to Maven Repositories, artifacts, Repository Managers, and CloudRepo.

We hope that you found it valuable and that, if these concepts were new to you (especially if you are starting that new JVM development job), it made things a little clearer.

We have been working with and helping others to learn Maven and all the things related to repositories, build scripts, and more, for many, many years. If you’d like to see any further information, walkthroughs, how-to guides, etc., please let us know. We’re always looking for good ideas for quality content and usually the best ideas come from people who are just learning Maven!