James Westall

A first look at Decentralised Identity

As an identity geek, something I’ve always struggled with has been user control and experience. Modern federation (such as OIDC/SAML) allows for generally coherent experiences, but it relies on user interaction with a central platform, removing control. Think of how many services you log into via your Facebook account. If Facebook was to go down, are you stuck? The same problem exists (albeit differently) with your corporate credentials.

Aside from centralisation, ask yourself, do you have a decent understanding of where your social media data is being shared? For most, I would guess the answer is no. Sure you can go and get this information, but that doesn’t show you who else has access to it. You own your identity and the associated data, but you don’t always control its use.

Finally, how many of your credentials cross that bridge into the real world? I would posit that not many do. If they are, it’s likely some form of app or website.

Enter Decentralised Identity

Thankfully, with these challenges in mind, the Decentralised Identity Foundation (DIF) has set to work. As a group, the foundation is working to develop an open, standards based ecosystem for scalable sharing and verification of identity data. At its core, the DIF has developed standards for Decentralised Identifiers (DIDs) as a solution, with varying secondary items under differing working groups. So, how does Decentralised Identity work?

In short, it uses cryptographic protocols and blockchain ledgers to enable verifiers that validate a user claim without talking to the original issuer. The owner of each claim holds full possession of the data, and presentation of the data in question requires the owners consent.

Explaining Decentralised Credentials.
Source:
https://sovrin.org/wp-content/uploads/2018/03/Sovrin-Protocol-and-Token-White-Paper.pdf — DID High level summary

In English Please?

A few excellent real world examples exist for where this Decentralised Identity could easily be applied. Say you (the owner) are an accredited accountant with Contoso Financial Advisors (the issuer). As a member, you are provided a paper based certificate of accreditation. On a job application, you provide this paper based record to a prospective employer (the verifier).

From here, your employer has a few options to validate your accreditation.

You have the “accredited” paper, so you must be legit right? Without verification of your accreditation, the work you put into obtaining it is invalidated.
Look for security features in your accreditation. This is vulnerable to fraud, with some documents not containing these features.
They can contact Contoso to check the validity of your accreditation.This relies on Contoso actually having a service to validate credentials while still operating.

As you provide this accreditation to the employer, you also have a few concerns about the data contained within. What if they take a copy of this accreditation for another purpose? What if they also sell this information?

Combined, the current options for validation of identity ensure that;

Any presented data is devalued, through either lack of verification or fraud,
Data control is given away without recourse,
A reliance is built on organisations to be permanently contactable.

DID’s work to solve these challenges in a few ways. A credential issued within the DID ecosystem is signed by both the Issuer and the owner. As this signature information is shared in a secure, public location anyone can complete a verification activity with a high degree of confidence. As only the verification data is held publicly, you (the owner) can provide data securely, with the verifier unable to pass this information onto third parties with an authentic signature.

Finally, if Contoso was to close down or be uncontactable, the use of a decentralised leger allows your employer to verify that you are who you say you are. The ledger itself has the added benefit of not requiring ongoing communication with Contoso, meaning they also benefit as they no longer have to validate requests from third parties.

Azure AD Verifiable Credentials

As a new technology, I was quite enthused to see Microsoft as member level contributor to the standards and working groups of the DIF. I was even more excited to see Microsofts DID implementation “Azure AD Verifiable credentials” announced into public preview. Although it is still a new service and the documentation is light on, I’ve been able to tinker with the Microsoft example codebase and have found the experience to be pretty slick.

To get started yourself, pull down the code from GitHub and step through the documentation snippets. Pretty quickly, you should have a service available using ngrok and a verifiable credential issued to your device. Look at me mum, I’m a verified credential expert!

Using the example codebase, you should note that the credential issuance relies on a Microsoft managed B2C tenant. The first step for anyone considering this technology is to plumb the solution into your own AAD.

To do so, you first need to create an Azure Key Vault resource as the Microsoft VC service stores the keys used to sign each DID. When provisioning, make sure you have key create, delete and sign permissions for your account. Without this, VC activation will fail.

Next, you need to navigate through to Azure AD, then enable VC under: Security> Verifiable Credentials (Preview).

Verifiable Credentials Enablement screen

Take note, if you plan to verify your domain it must be available over https and not utilise redirects. This held up my testing as my container hit the lets-encrypt provisioning limits.

Once you have enabled your environment, you need to create a rules and a display file before activating a new credential type. These files define what your credential looks like and what must be completed to obtain one. I created a simple corporate theme matching the Arinco environment, plus a requirement to log into Azure AD. Each rule is defined within an attestations block, with the mapping for my id token copying through to attributes held by the VC. One really nice thing when testing out basic capability is that you can create an attestation which takes only user input, meaning no configuration of an external IDP or consumption of other VC is required.

My Rules File

{
  "attestations": {
    "idTokens": [
      {
        "mapping": {
          "firstName": { "claim": "given_name" },
          "lastName": { "claim": "family_name" }
        },
        "configuration": "https://login.microsoftonline.com/<MY Tenant ID>/v2.0/.well-known/openid-configuration",
        "client_id": "<MY CLIENT ID>",
        "redirect_uri": "vcclient://openid/",
        "scope": "openid profile"
      }
    ]
  },
  "validityInterval": 2592000,
  "vc": {
    "type": ["ArincoTestVC"]
  }
}

My Display File

{
  "default": {
    "locale": "en-US",
    "card": {
      "title": "Verified Employee",
      "issuedBy": " ",
      "backgroundColor": "#001A31",
      "textColor": "#FFFFFF",
      "logo": {
        "uri": "https://somestorageaccount.blob.core.windows.net/aad-vc-logos/Logo-SymbolAndText-WhiteOnTransparent-Small.png",
        "description": "Arinco Australia Logo"
      },
      "description": "This employee card is issued to employees and directors of Arinco Australia"
    },
    "consent": {
      "title": "Do you want to get your digital employee card from Arinco Demo?",
      "instructions": "Please log in with your Arinco Demo account to receive your employee card."
    },
    "claims": {
      "vc.credentialSubject.firstName": {
        "type": "String",
        "label": "First name"
      },
      "vc.credentialSubject.lastName": {
        "type": "String",
        "label": "Last name"
      }
    }
  }
}

Once you have created your files, select create new under the credentials tab within Azure AD. The process here is pretty straight forward, with a few file uploads and some next-next type clicking!

Verifiable Credentials Provisioning Screen

Once uploaded to Azure AD, you’re ready to build out your custom website and test VC out! The easiest way to do this is to follow the Microsoft documentation, updating the provided sample, testing functionality and then rebranding the page to suit your needs. With a bit of love, you end up with a nice site like below.

And all going well, you should be able to create your second verifiable credential.

The overall experience?

As verifiable credentials is a preview service, there’s always going to be a bit of risk associated with deployment. That being said, I found the experience to be straight forward with only a few teething issues.

One challenge I would articulate for others is while provisioning https certificates do not configure DNS for your DID well known domain. This causes authenticator to attempt to connect over https with the user experience slowed by about two to three minutes of spinning progress wheels while the application completes retries.

As for new capability, I’m really looking forward to seeing where the service goes with my primary wish list as follows:

Some form of secondary provisioning aside from QR Codes. I personally don’t enjoy QR due to a leftover distaste from COVID-19 contact tracing in Australia. A way to distribute magic links for provisioning, or silent admin led provisioning, would be really appreciated.
Any form of NFC support. To me, this is the final frontier to cross for digital/real world identity. Imagine if we could use VC for services such as access to buildings, local shops or even public transport.

Hopefully, you have found this article informative. Until next time, stay cloudy!

Connecting Security Centre to Slack – The better way

Recently I’ve been working on some automated workflows for Azure Security Center and Azure Sentinel. Following best practice, after initial development, all our Logic Apps and connectors are deployed using infrastructure as code and Azure DevOps. This allows us to deploy multiple instances across customer tenants at scale. Unfortunately, there is a manual step required when deploying some Logic Apps, and you will encounter this on the first run of your workflow.

This issue occurs because connector resources often utilise OAuth flows to allow access to the target services. We’re using Slack as an example, but this includes services such as Office 365, Salesforce and GitHub. Selecting the information prompt under the deployed connector display name will quickly open a login screen, with the process authorising Azure to access your service.

Microsoft provides a few options to solve this problem;

Manually apply the settings on deployment. Azure will handle token refresh, so this is a one time task. While this would work, it isn’t great. At Arinco, we try to avoid manual tasks wherever possible
Pre-deploy connectors in advance. As multiple Logic Apps can utilise the same connector, operate them as a shared resource, perhaps owned by a platform engineering group.
Operate a worker service account, with a browser holding logged-in sessions. Use DevOps tasks to interact and authorise the connection. This is the worst of the three solutions and prone to breakage.

A better way to solve this problem would be to sidestep it entirely. Enter app webhooks for Slack. Webhooks act as a simple method to send data between applications. These can be unauthenticated and are often unique to an application instance.

To get started with this method, navigate to the applications page at api.slack.com, create a basic application, providing an application name and a “development” workspace.

Next, enable incoming webhooks and select your channel.

Just like that, you can send messages to a channel without an OAuth connector. Grab the CURL that is provided by Slack and try it out.

Once you have completed the basic setup in Slack, the hard part is all done! To use this capability in a Logic App, add the HTTP task and fill out the details like so:O

You will notice here that the request body we are using is a JSON formatted object. Follow the Slack block kit and you can develop some really nice looking messages. Slack even provides an excellent builder service.

Block kit enables you to develop rich UI within Slack.

Completing our integration in this manner has a couple of really nice benefits – Avoiding the manual work almost always pays off.

No Manual Integration, Hooray!
Our branding is better. Using the native connector does not allow you to easily change the user interface, with messages showing as sent by “Microsoft Azure Logic Apps”
Integration to the Slack ecosystem for further workflows. I haven’t touched on this here, but if you wanted to build automatic actions back to Logic Apps, using a Slack App provides a really elegant path to do this.

Until next time, stay cloudy!

Integrating Kubernetes with Okta for user RBAC.

Recently I built a local Kubernetes cluster – for fun and learning. As an identity geek, the first thing I assess when building any service, is “How will I log in?” Sure I can use local Kubernetes service accounts, but where is the fun in that? Getting started with my research, I found a couple of really great articles documents and I would be amiss If I didn’t start by sharing these;

Kubernetes Authentication doc – RTFM always
Kubernetes & Google OIDC by Benji Visser
Kubectl with OpenID Connect by Hidetake Iwata -> Closely aligned to this post

I don’t think it needs acknowledgement, but the K8s ecosystem is diverse, and the documentation can be daunting at the best of times. This post is slightly tuned to my setup, so you may need to tinker a bit as you get things working on your own cluster. I’m using the following for those who would like to follow along at home;

A MicroK8s based Kubernetes cluster (See here for quick setup on raspberry pi’s)
An Okta Developer tenant (Signup here – Free for the first 5 apps)

Initial Okta Configuration

To configure this integration for any identity provider, we will need a client application; First up – create an OIDC application using the app integration wizard. You will want a “web application” with a login URL that looks something like so

https://localhost:8000

Pretty straight forward stuff, customise the name/logo as you like. Once you pass the initial screen, note down your client ID & secret for use later. Kubernetes services often are able to refresh OIDC tokens for you; to support this, you will need to modify the allowed grant types to include Refresh – A simple checkbox under the application options.

Finally, assign some users to your application. After all, it’s great to be able to login with OIDC tokens, but if Okta won’t assign them in the first place, why bother? 😀

Modifying the Kubernetes API Server

Once you’ve completed your Okta setup, it’s time to move to Kubernetes. There is a few configuration flags needed to configure the kube-apiserver, but only two are hard requirements. The others will make things easier to manage in the long run;

--oidc-issuer-url=https://dev-987710.okta.com/oauth2/default #Required
--oidc-client-id=00a1g3wxh9x9KLciv4x9 #Required
--oidc-username-prefix=oidc: #Recommended - Makes things a bit neater for RBAC
--oidc-groups-prefix=oidcgroup: #Recommended

Apply these to your api config. If you’re using kops, simply add the changes using kops edit cluster. In my case, I’m using microk8s, so I edit the apiserver configuration located at:

/var/snap/microk8s/current/args/kube-apiserver

and then apply this with:

microk8s stop; microk8s start

Testing Login

At this point we can actually test our sign-in, albeit with limited functionality. To do this, we need to grab an ID token from Okta API using curl or postman. This is a three step process.

1. Establish a session using the Sessions API

curl --location --request POST 'https://dev-987710.okta.com/api/v1/authn' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json'\
--data-raw '{
"username": "kubernetesuser@westall.co",
"password": "supersecurepassword",
"options": {
"multiOptionalFactorEnroll": true,
"warnBeforePasswordExpired": true
} 
}'

2. Exchange your session token for an auth code

curl --location --request GET 'https://dev-987710.okta.com/oauth2/v1/authorize?client_id=00a1g3wxh9x9KLciv4x9&response_type=code&response_mode=form_post&scope=openid%20profile%20offline_access&redirect_uri=http%3A%2F%2Flocalhost%3A8000&state=demo&nonce=b14c6ea9-4975-4dff-9cf6-1b475045dffa&sessionToken=<SESSION TOKEN FROM STEP 1>'

3. Exchange your auth code for an access & id token

curl --location --request POST 'https://dev-987710.okta.com/oauth2/v1/token' \
--header 'Accept: application/json' \
--header 'Authorization: Basic <Base64 Encoded clientId:clientSecret>=' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'grant_type=authorization_code' \
--data-urlencode 'redirect_uri=http://localhost:8000' \
--data-urlencode 'code=<AUTH CODE FROM STEP 2>'

One we have collected a valid ID token, we can run any command against the kubernetes API using the –token & server flag.

kubectl get pods -A --token=<superlongJWT> --server='https://192.168.1.20:16443'

Don’t stress if you get an “Error from server (Forbidden)” prompt back from your request. Kubernetes has a deny by default RBAC design that means nothing will work until a permission is configured.

If you are like me and are also using Microk8s, you should only get this error if you have already enabled the RBAC add on. By default, Microk8s runs with the api server flag: –authorization-mode=AlwaysAllow . This means that any authenticated user should be able to run kubectl commands. If you want to enable fine grained RBAC, the command you need is:

microk8s enable rbac

Applying access control

To make our kubectl commands work, we need to apply a cluster permission. But before I dive into that, I want to point out something that is a bit yucky. Note that for each command, my user is prefixed as configured, but the identifier presented is the users unique Okta profile ID.

While this will work when I assign access, it’s definitely hard to read. To fix this, I’m going to add another flag to my kube-api config.

--oidc-username-claim=preferred_username

Now that we have applied this, you can see that I’m getting a slightly better experience – My users are actually identified by a username! It’s important for later to understand that this claim is not provided by default. In my original authorization request I ask for the “profile” scope in ADDITION to the openid scope.

From here, we can begin to apply manifests against each user for access (using an authorized account). I’ve taken the easy route and assigned a clusterAdmin role here:

kubectl apply -f - <<EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: oidc-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: User
  name: oidc:Okta-Lab@westall.co
EOF

At this point you should be able to test again and get your pods with an OIDC token!

Tidying up the login flow

Now that we have a working kubectl client, I think most people would agree that 3 Curl requests and a really long kubectl command is a bit arduous. One option to simplify this process is to use the native kubectl support for oidc within your kubeconfig.

Personally, I prefer to use the kubectl extension kubelogin. The benefit of using this extension is it simplifies the login process for multiple accounts and your kubeconfig contains arguably less valuable data. To enable kubelogin, first install it;

# Homebrew (macOS and Linux)
brew install int128/kubelogin/kubelogin

# Krew (macOS, Linux, Windows and ARM)
kubectl krew install oidc-login

# Chocolatey (Windows)
choco install kubelogin

Next update your kubeconfig with an OIDC user like so. Note the use of the –oidc-extra-scope flag. Without this Okta will return a token without your preferred_username claim, and signin will fail!

- name: okta
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://dev-987710.okta.com
      - --oidc-client-id=00a1g3wxh9x9KLciv4x9
      - --oidc-client-secret=<okta client secret>
      - --oidc-extra-scope=profile
      - -v1 #Not required, useful to see the process however
      command: kubectl
      env: null

Finally, configure a a new context with your user and run a command. Kubelogin should take care of the rest.

In my opinion, for any solution you deploy, personal or professional. Identity should always be something you properly implement. I’ve been lucky to have most of the K8s magic abstracted away from me (Thanks Azure) and I found this process immensely informative and a useful exercise. Hopefully you will find this post useful in your own identity journey 🙂 Until next time, stay cloudy!

On the value of certification.

Recently I sat two of the new Microsoft Security exams, Security Operations Analyst (SC-200) and Identity and Access Management Administrator (SC-300). These two exams are number 9&10 in my Microsoft certification journey and number 17&18 over the last two years. As beta exams, I won’t receive my results for eight weeks, ample time for me to finally get around to writing a post on the my journey and the overall value I got from it. Buckle up, because I didn’t know I had this many certification thoughts in me!

My certification journey started as I began developing my Azure skills. I had recently moved from being an on-premise focused engineer, dealing with VMWare clusters, SANs, general Windows infrastructure and all the fun stuff you get on premises. Azure was so new to me and I felt like a complete bonehead. Cloud seemed hard and I felt stupid. I’d taken some technology associate exams when I worked as a service desk analyst and stopped my initial foray into study as I dipped my toe into the whole MCSA/MCSE ecosystem. I enrolled initially into the AZ-100 exam, half skeptical that I would drop my studies in a similar manner.

What I ended up finding was an accessible ecosystem, where I was able to develop myself and learn new technology. For that first exam, I spent a three weeks or so preparing – mostly watching Nick Colyer on ACloudGuru (Actually just re-branded skylines academy content). I sat at home and completed the exam quickly scoring an 885. To say I was stoked was an understatement. I quickly booked my AZ-101 a week later, feeling buoyed by a decent result & a shorter set of content. The 101 exam prep was only ~5 hours of video at the time.

The day after booking my AZ-101, Microsoft announced the newer AZ-103 type exam, with the caveat that AZ-100 holders would automatically receive the associate certification. I was a little annoyed initially, but the decision did take the pressure off my 101 exam. I really wished it didn’t, as I really struggled in this one. I remember sitting at my desk on the second question having a mental meltdown, because I had absolutely no idea about some app-service question. I thought I barely passed that exam, however I managed to luck my way into an 894. According to the scoring, I was better at web than I was at systems & network management. This was (and is) wrong. Hooray for bell curves I suppose?

At the end of this process, I had two things on my mind; One, the systems admin (AZ-100) exam was way too easy. I knew I would expect a system administrator to know more about Azure than what I was tested on. This would become a theme for me as I continued to take further certifications. Two, the labs were quite literally a lifesaver and will be for a lot of other people. I can’t speak for my readers, but having used all three cloud providers, I can without a doubt say that the Azure portal is the most user friendly. The lab scenarios would ask me to configure X thing on Y service. The entire time, I knew if I didn’t know how, I could just hover over the various blades of that resource, and have information fed to me about what each option would do. I’m almost certain I passed the 101 exam in this manner.

Following on from my associate success, my supportive employer suggested I move onward to the Azure Architecture focuses. Still feeling like I didn’t know enough Azure, I was quite happy to take the exam support and move up. AZ-300 and 301 followed a similar pattern, albeit with more study videos. This is probably the laziest way to prepare for an exam, because I found I would just watch content and hope you remember it – Labs were time consumign and I could just watch the instructor anyway. I did do my best to apply any learning in my day to day, so I was still getting some hands on, but not in the same manner of old where labbing solutions out was almost mandatory.

The Solution Architect expert exams followed a similar process to my Associate admin exams, with the 301 first (reviews said it was easier) and the 300 second. I passed the 300 comfortably and the 301 barely. At this point, I was pretty well engaged and committed to further exams. This was for a couple of reasons; I enjoyed learning about new things, even though I could not always see an application in my day to day. My employer was happy to pay – I could do nearly any exam for free, provided I passed on the first attempt. Finally, I liked the community recognition it gave me as someone adaptable and willing to learn.

Following the Architect certication, I went on to achieve another 6 Microsoft certifications (hopefully 8 soon), 4 AWS, 1 GCP, 3 Okta and 2 HashiCorp certifications. I’m not afraid to admit, that certain exams were taken only on a request from my employer – Namely HashiCorp Vault Associate & GCP Associate Cloud Engineer. While I’ve used both products in production, they are not something I’m super passionate about. This is the nature of working for a service provider. Sometimes a channel partner requires certification and you will be asked to assist in that process.

Hardest & Easiest Exams?

Of the twenty or so exams I’ve so far taken, the most difficult were Okta certifications, with the easiest being both HashiCorp exams.

The Okta exams were not difficult due to the content covered, but because to their format. Okta uses a format known as Discrete Option Multiple Choice (DOMC). If you read the blurb on DOMC, it’s all about fairness and integrity. Test takers only ever see 50-60% of the total exam, so it becomes a lot harder for dodgy “dumps” websites to steal the questions wholesale. This being said, I found the question format stressful. I was extremely used to using various exam techniques to solve some solutions where I wasn’t comfortable and this was taken away from me. You have to know the content well for DOMC exams. The other problem I found was my anxiety levels were increased, due to uncertainty on future options. DOMC is really good at ensuring you know a platform well, but if you know it too well, expect to feel stressed. You always want to select the best answer, but you don’t know if it will be presented. I distinctly remember stewing on a DOMC question about sign-in data. I knew the prompt of “you can get this x info from the system log” was not the documented/published way of getting this info, but I also knew it was possible to extract system logs and parse them to obtain the relevant data. In a scenario where you don’t know that the “correct” answer will be presented and a wrong selection means all your progress through the question is null & void, this is horribly stressful. You’re punished for knowing more. I also found a woeful lack of third party content available for Okta exams, which would have been quite challenging if I didn’t have employer support.

I think the HashiCorp exams were easier mainly due to the maturity of the program. Currently, you can complete exams on Terraform, Consul and Vault at an “Associate” level. I found the ones I took to be a bit more aligned with foundational concepts and understanding of what the products were, rather than how to use a product in anger. Similar to the Microsoft 900 series. For their price, I’m definitely not upset about the difficulty level, $70 is the most affordable exam I’ve taken. I would be interested to hear the thoughts of other people on this one, because often taking on new technology as a more experienced cloud engineer can be a bit misleading – Your past experience will help you pick up the tech a lot faster (Cloudformation -> ARM Templates -> Terraform is an easy example).

Bumpy ride?

I’ve heard from a few people that their experiences were pretty rough. I have to say, in the scheme of things I feel I’ve gotten off lightly. In lab scenarios, the worst I’ve had to deal with is a wait time for resource deletion (I made a mistake). For everything else, it really depends on the delivery method in question. My AWS & GCP exams were completed in person, an experience which I would not choose over remote proctored. It’s just a hassle for me to get out to a testing center and it’s nowhere near as nice or comfortable as my home office. For remote proctored, I’ve had a couple of testing issues/shoddy check-ins. Nothing that can’t be worked through by chatting to the various providers. My biggest bit of advice in these scenarios is just to stay calm and work through the various support processes.

Value?

For me, the value of certification is just now becoming debatable. I’m a fair bit more experienced than when I started. If I was starting the same certification journey from scratch, I think it would be a hard sell. I wouldn’t have the “I suck at this” mentality that I originally struggled with, or the “Need an entry level job” driver that people starting out have. All up, I’ve spent about $2000 out of pocket on various labs, training material and certifications. If I include what my employers have reimbursed me for, I’m well over the $5000 mark. For this investment, I’m glad to say I’ve had a couple of things happen.

I’ve increased my own skills in the technology I care about.
I’ve developed a network of like minded people around me.
I’ve increased my employment opportunities.

Of these three things, the first is self explanatory; Studying something tends to make you better at it. As a social benefit, I found that as I posted and shared my thoughts on the certification journey, people engaged more freely and even connected with me just to chat on the cloud journey. Showing passion definitely encourages other people to engage with you. Generally I found that I was able to learn from these people too. Some people even reached out regarding the journey that might be right for them – The answer for this question is always, find something that you love. As for the employment opportunities, certifications are a great way to get noticed by recruiters & hiring managers. I receive in-mail on LinkedIn about once every three weeks based on not much else than my certification record – I don‘t think this is a good thing and I don’t like it. What you learn in a book/video/lab is always different from the real world. I’ve met people who have 20+ certifications who I would judge to be technically inept, while I’ve also met people with 0 who are absolute cloud wizards. My advice to anyone looking for a job is to show your passion through blogging, GitHub or community engagement and THEN look at getting certified. Just focusing on certifications will get you noticed by lazy and ignorant recruiting firms and through some HR filters – it won’t get you a job (In my opinion). Passion and experience are way more important here.

Closing Thoughts

If you’re still with me after this rambling 2000 word or so monologue, thank you. Hopefully this post has provided you with some insight into cloud certification. As for me, I plan to continue with my journey, albeit with some exams I think will be the most challenging yet. I’m currently preparing to take the ISC CISSP and the Certified Kubernetes Administrator. I’m still getting value out of this process and I plan to for some time to come. Until next time, stay cloudy!

Azure AD Application Policies Simplified

One of the most common arguments I hear when discussing the move to Azure AD is: “ADFS lets me control everything”. For change adverse organisations, this can be a legitimate problem. More often than not however, the challenge is not that Azure AD cannot be customised to the organisational need. Instead, it is that operators don’t understand how to customise Azure AD. When considering ADFS, the following areas are commonly updated to match business requirements

Branding
Claims Policy
Home Realm Discovery
Token Lifespans

Branding is a pretty common requirement and can be modified in two ways, depending if you’re focused on business or consumer identity. Claims Policy, HRD and Token lifespans are all a bit more confusing, with policy for these being the topic of todays post.

Policy Types

If you pop the hood on Azure AD using Graph, you will discover quickly that application policies are derived from the “stsPolicy” resource. This ensures that nearly every policy follows a standard format, with the key difference occurring within the definition element. Generally speaking, If you’ve written one policy type, you can write them all. Application Policies can be applied against both the Application and the application Service Principal, meaning rather than the two types that are immediately indicated in the Application documentation, we actually have five types. If you’re not aware of how Azure AD Applications and Service Principals work together, Microsoft provides a good summary here.

Policy Type	Usage Scenario	ADFS Equivalent
HomeRealmDiscovery	“Fast Forwarding” directly from Azure AD to a branded sign-in page or external IDP. Useful in migration scenarios.	Home Realm Discovery
ClaimsMappingPolicy	Mapping data that is not supported by “Optional Claims” into SAML, ID and Access tokens.	Claim Rules
PermissionGrantPolicy	Bypass admin approval flows when users request specific permissions. EG Graph/User.Read	N/A
TokenIssuancePolicy	Update Characteristics of SAML tokens – Things like token signing or SAML Version.	WS-Fed and custom certificates
TokenLifetimePolocy	Extend or modify how long SAML, or ID tokens are valid for.	Relying Party Token Lifetimes

Unfortunately documentation on application policies is currently a little light on content, and there is a few important details you must understand when applying them;

As of writing, some policy types are in preview, meaning that Microsoft reserves the right to change how they work.
ClaimsMappingPolicies require you to set the “acceptMappedClaims” value to true within the application manifest OR configure a custom signing key.
TokenLifeTimePolicy works only for ID and Access tokens as of January 31st 2021. Refresh and session tokens have moved to Conditional Access session control.

Reading Policy Objects

Thankfully the current specifications for policy objects are quite simple. In the below example we declare a ClaimsMappingPolicy which maps employeeid data from the Azure AD User through to SAML and ID Tokens.

{
    "ClaimsMappingPolicy": {
        "Version": 1,
        "IncludeBasicClaimSet": "true",
        "ClaimsSchema": [
            {
                "Source": "user",
                "ID": "employeeid",
                "SamlClaimType": "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/employeeid",
                "JwtClaimType": "employeeid"
            }
        ]
    }
}

One principal to apply when building policies is to ensure they remain granular. This makes the effect of a policy clear and also enables you to assign one policy to many applications.

Applying Policy

Applying a policy to an application is currently not supported within the Azure AD portal, requiring you to use PowerShell and the AzureADPreview module. This is a pretty simple five step process.

1. Import the AzureADPreview Module and sign in to Azure
2. Create your application, either in the portal or using PowerShell
3. Create your application policy using PowerShell

#Create Policy Object
New-AzureADPolicy -Definition @('{"ClaimsMappingPolicy":{"Version":1,"IncludeBasicClaimSet":"true", "ClaimsSchema": [{"Source": "user","ID":"employeeid","SamlClaimType":"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/employeeid","JwtClaimType":"employeeid"}]}}oli') -DisplayName EmitEmployeeIdClaim -Type ClaimsMappingPolicy

4. Assign your policy to your application

#Apply Policy to targeted application.
Add-AzureADServicePrincipalPolicy -Id <ServicePrincipalOBJECTId> -RefObjectId <PolicyId>

5. Validate your policy assignment

Get-AzureADServicePrincipalPolicy -Id <ServicePrincipalOBJECTId>

Hopefully you have found this post informative, with a few of your policy options de-mystified. As always, feel free to each out if you have any questions regarding your own Identity and Access Management scenarios.

Effortless sync for Azure AD B2B users within AD Connect

Recently I have been working on a few identity projects where Azure AD B2B users have been a focus point. The majority of organisations have always had a solution or process for onboarding contractors and partners. More often then not, this is simply “Create an AD Account” and call it a day. But what about Azure AD? How do organisations enable trusted parties, without paying for it?

Using native “cloud only” B2B accounts lets organisations onboard contractors seamlessly, but what about scenarios where you want to control password policy? Or grant access to on-premise integrated solutions? In these scenarios, retaining the on-premise process can be a hard requirement. Most importantly, we need to solve all these questions without changes to existing business process

Thankfully, Microsoft has developed support for UserTypes within AD Connect. Using this functionality, administrators can configure inbound and outbound synchronisation within AD Connect, with the end result being on-premise AD mastered, guest accounts within Azure AD.

The Microsoft Process

Enabling this synchronisation according to the Microsoft documentation is a pretty straight-forward task;

Disable synchronisation – You should complete this before carrying out any work on AD connect
Designate and populate an attribute which will identify your partner accounts. “ExtensionAttributes” within AD are a prime target here.
Using the AD Connect Sync manager, ensure that you are importing your selected attribute.
Using the AD Connect Sync Manager, enable “userType” within the Azure AD schema

Add source attribute to Azure AD Connector schema — Enabling UserType within the AAD Schema

5. Create an import rule within the AD Connect rules editor, targeting your designated attribute. Use an expression rule like so to ensure the correct value is applied.

IIF(IsPresent([userPrincipalName]),IIF(CBool(InStr(LCase([userPrincipalName]),"@partners.fabrikam123.org")=0),"Member","Guest"),Error("UserPrincipalName is not present to determine UserType"))

6. Create an export rule moving your new attribute from the metaverse through to Azure AD
7. Enable synchronisation and validate your results.

A Better Way to mark B2B accounts

While the above method will most definitely work, it has a couple of drawbacks. Firstly, it relies on data entry. If the designated attribute is not set correctly, your users will not update. If you haven’t already got this data, you also need to apply it. More work. Secondly, this process can be achieved through a single sync rule and basic directory management. Less locations for our configuration to break.

To apply this simpler configuration, you still complete Steps 1 and 4 from above. Next, you ensure that your users are properly organised into OU’s. For this example, I’m using a “Standard” and “Partner” OU structure.

Finally, you create a single rule outbound from the AD Connect metaverse to Azure AD. As with most outbound rules, ensure you have an appropriate scope. In the below example we want all users, who are NOT mastered by Azure AD.

The critical part of your rule is the transformations. Because DistinguishedName (CN + OU) is imported to AD Connect by default, our rule can quickly filter on the OU which holds our users.

IIF(IsPresent([distinguishedName]),IIF(CBool(InStr(LCase([distinguishedName]),"ou=users - partners,dc=ad,dc=westall,dc=co")=0),"Member","Guest"),Error("distinguishedName is not present to determine UserType"))

And just like that, we have Azure AD Accounts, automatically marked as Guest Users!

Balon Greyjoy
Barristan Selmy
Benjen Stark
Beric Dondarr.,.
Bran Stark
Brienne Of Tar...
Brynden Tully
BaIon.Greyjoy@lab.westall.co
Barristan.selmy@lab.westall.co
Benjen.stark@lab.westall.co
Beric.Dondarrion@lab.westall.co
Bran.Stark@lab.westall.co
Brienneof.Tarth@lab.westall.co
Brynden.Tully@lab.westall.co
Guest
Guest
Member
Guest
Member
Guest
Guest — B2B and Member accounts copied from AD

Empowered Multi Cloud: Azure Arc and Kubernetes

At Arinco, we love Kubernetes, and in this post I’ll be covering the basics of configuring Azure Arc on Kubernetes. As a preview feature, this integration enables Azure administrators to connect to remote Kubernetes clusters, manage deployments, policy and monitoring data, without leaving the Azure Portal. If you’re experienced with Google Cloud, this functionality is remarkably similar to Google Anthos, with the main difference being that Anthos only focuses on Kubernetes, whereas Arc will quite happily manage Servers, SQL and Data platforms as well.

Before we begin, there is a couple of key facts that you need to be aware of while Arc for Kubernetes is in preview:

Currently only East US and West Europe deployments are supported.
Only x64 based clusters will work at this time and no manifests are published for you to recompile software on other architectures.
Testing of supported clusters is still in early days. Microsoft doesn’t recommend the Arc enabled Kubernetes solution for production workloads

Enabling Azure Arc

Assuming that you already have a cluster that will be supported, configuring a connected Kubernetes instance is a monumentally simple task Two steps to be exact.

1. Enable the preview azure cli extensions

1	`az extension add --name connectedk8s` `az extension add --name k8sconfiguration`

2. Run the CLI commands to enable an ARC enabled cluster

1	`az connectedk8s connect --name GKE-KUBERNETES-LAB --resource-group KUBERNETESARC-RG01`

Under the hood, Azure CLI completes the following when we execute the above command:

Creates an ARM Resource for your cluster, generating the relevant connections and secrets.
Connects to your currently cluster context (see kubeconfig) and creates a deployment using Helm. ConfigMaps are provided with details for connecting to Azure, with resources being published into an azure-arc namespace
Monitors this deployment to completion. For failing clusters, expect to be notified of failure after approximately 5-10 minutes.

If you would like to watch the deployment, it generally takes around 30 seconds for an Arc namespace to show up and from there you can watch as Azure Arc related pods are scheduled.

So what can we do?

Once a cluster is on-boarded to Arc, there is actually quite a bit you can do in preview, including monitor. The most important in my opinion is simplified method to control clusters via the GitOps model. If you were paying attention during deployment, you will have noticed that Flux is used to deliver this functionality. Expect further updates here, as Microsoft has publicly committed recently to further developing a standardised GitOps model.

Using this configurations model is quite simple, and to be perfectly honest, you don’t even need to understand exactly how Flux works. First, commit your Kubernetes manifests to a public repository, don’t stress too much about order or structure. Flux is basically magic here and can figure everything out. Next add a configuration to your cluster and go grab a coffee.

For my cluster, I’ve used the Microsoft demo repository. Simply fork this and you can watch the pods create as you update your manifests.

Closing Thoughts

There is a lot of reasons to run your own cluster, or a cluster in another cloud. Generally speaking, if you’re currently considering Azure Arc you will be pretty comfortable with the Kubernetes ecosystem as a whole.

Arc enabled clusters will just be another tool you could add, and you should use same consideration that you apply for every other service you consider utilising. In my opinion the biggest benefit of the service is simplified and centralized management capability across multiple clusters. This allows me to manage my own AKS clusters and AWS/GCP clusters with centralized policy enforcement, RBAC and monitoring. I would probably look to implement Arc if I was running a datacenter cluster, and definitely if I was looking to migrate to AKS in the future. If you are looking to get test out Arc for yourself, I would definitely recommend the Azure Arc Jumpstart.
Until next time, stay cloudy!

Originally posted at arinco.com.au

Empowered Multi Cloud: Onboarding IaaS to Azure Arc

More often than not, organisations move to the cloud on a one way path. This can be a challenging process with a large amount of learning, growth and understanding required. But why does it all have to be in one direction? What about modernising by bringing the cloud to you? One of the ways that organisations can begin this process when moving to Azure is by leveraging Azure Arc, a provider agnostic toolchain that supports integration of IaaS, Data services and Kubernetes to the Azure Control Plane.

Azure Arc management control plane diagram — Azure Arc Architecture

Using Arc, technology teams are enabled to use multiple powerful Azure tools in an on-premise environment. This includes;

Azure Policy and guest extensions
Azure Monitor
Azure VM Extensions
Azure Security Centre
Azure Automation including Update Management, Change Tracking and Inventory.

Most importantly, the Arc pricing model is my favourite type of pricing model: FREE! Arc focuses on connecting to Azure and providing visibility, with some extra cost required as you consume secondary services like Azure Security Centre.

Onboarding servers to Azure Arc

Onboarding servers to Arc is a relatively straight forward task and is supported in a few different ways. If you’re working on a small number of servers, onboarding using the Azure portal is a manageable task. However, if you’re running at scale, you probably want to look at an automated deployment using tools like the VMWare CLI script or Ansible.

For the onboarding in this blog, I’m going to use the Azure Portal for my servers. First up, ensure you have registered the HybridCompute provider using Azure CLI.

az provider register --namespace 'Microsoft.HybridCompute'

Next, search for Arc in the portal and select add a server. The process here is very much “follow the bouncing ball” and you shouldn’t have too many questions. Data residency is already supported for Australia East, so no concerns there for regulated entities!

Providing basic residency and storage information

When it comes to tagging of Arc servers, Microsoft suggests a few location based tags, with options to include business based also. In a lab scenario like this demo, location is pretty useless, however in real-world scenarios this can be quite useful for identifying what resources exist in each site. Post completion of tagging, you will be provided with a script for the target server. You can use generated script for multiple servers, however, you will need to update any custom tags you may add.

The script execution itself is generally a pretty quick process, with the end result being a provisioned resource in Azure and the Connected Machine Agent on your device.

So what can we do?

Now that you’ve completed onboarding you’re probably wondering what next? I’m a big fan of the Azure Monitoring platform (death to SCOM), so for me this will always be a Log Analytics onboarding task, closely followed by Security Centre. One of the key benefits with Azure Arc is the simplicity of everything, so you should find onboarding any Arc supported solution to be a straight forward process. For Log Analytics navigate to insights, select your analytics workspace, enable and you’re done!

What logs you collect is entirely on your logging collection strategy with Microsoft providing further detail on that process here. In my opinion, the performance data being located in a single location is worth it’s weight in gold.

If you have already connected Security Centre to your workspace, onboarding to Log Analytics often also connects your device to Security centre, enabling detailed monitoring and vulnerability management.

Domain controller automatically enabled for Security Centre

Right for you?

While the cloud enables organisations to move quickly, sometimes moving slowly is just what the doctor ordered. Azure Arc is definitely a great platform for organisations looking to begin using Azure services and most importantly, bring Azure into their data centre. If you’re wanting to learn more about Arc, Microsoft has published an excellent set of quick-starts here and the documentation is also pretty comprehensive. Stay tuned for our next post, where we explore using Azure Arc with Kubernetes. Until next time, stay cloudy!

Managing Container Lifecycle with Azure Container Registry Tasks

Recently I’ve been spending a bit of time working with a few customers, onboarding them to Azure Kubernetes Service. This is generally a pretty straight forward process; Build Cluster, Configure ACR, Setup CI/CD.

During the CI/CD buildout with one customer, we noticed pretty quickly that our cheap and easy basic ACR was filling up rather quickly. Mostly with development containers which were used once or twice and then never again.

In my opinion the build rate of this repository wasn’t too bad. We pushed to development and testing 48 times over a one week period, with these incremental changes flowing through to production pretty reliably on our weekly schedule.

That being said, the growth trajectory put our development ACR filling up in about 3-4 months. Sure we could simply upgrade the ACR to a standard or premium tier, but at what cost? A 4x price increase between basic and standard SKU’s, and even steeper 9x to premium. Thankfully, we can solve for this in few ways.

Manage our container size – Start from scratch or a container specific OS like alpine.
Build containers less frequently – We have almost a 50:1 development to production ratio, so there is definitely a bit of wiggle room there.
Manage the registry contents, deleting old or untagged images.

Combining these options will provides our team with a long term and scalable solution. But how can we implement item number 3?

ACR Purge & Automatic Cleanup

As a preview feature, Azure Container Registry now supports filter based cleanup of images and containers. This can be completed as an ad-hoc process or as a scheduled task. To get things right, I’ll first build an ACR command that deletes tagged images.

# Environment variable for container command line
PURGE_CMD="acr purge \
  --filter 'container/myimage:dev-.*' \
  --ago 3d --dry-run"

az acr run \
  --cmd "$PURGE_CMD" \
  --registry mycontainerregistry \
  /dev/null

I’ve set an agreed upon container age for my containers and I’m quite selective of which containers I purge. The above dry-run only selects the development “myimage” container and gives me a nice example of what my task would actually do.

Including multiple filters in purge commands is supported. So, feel free to build expansive query sets. Once you are happy with the dry run output, it’s time to setup an automatic job. ACR uses standard cronjob syntax for scheduling, so this should be a pretty familiar experience for linux administrators.

PURGE_CMD="acr purge \
  --filter 'container/my-api:dev-.*' \
  --filter 'container/my-db:dev-.*' \
  --ago 3d"

az acr task create --name old-container-purge \
  --cmd "$PURGE_CMD" \
  --schedule "0 2 * * *" \
  --registry mycontainerregistry \
  --timeout 3600 \
  --context /dev/null

And just like that, we have a task which will clean up our registry daily at 2am.

As an ARM template please?

If you’re operating or deploying multiple container registries for various teams, you might want to standardise this type of task across the board. As such, integrating this into your ARM templates would be mighty useful.

Microsoft provides the “Microsoft.ContainerRegistry/registries/tasks” resource type for deploying these actions at scale. There is, however, a slightly irritating quirk. Your ACR command must be base64 encoded YAML following the tasks specification neatly documented here. I’m not sure about our readers, but generally combining Base64, YAML and JSON leaves a nasty taste in my mouth!

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "containerRegistryName": {
            "type": "String",
            "metadata": {
                "description": "Name of the ACR to deploy task resource."
            }
        },
        "containerRegistryTaskName" : {
            "defaultValue": "old-container-purge",
            "type": "String",
            "metadata": {
                "description": "Name for the ACR Task resource."
            }
        },
        "taskContent" : {
            "defaultValue": "dmVyc2lvbjogdjEuMS4wCnN0ZXBzOiAKICAtIGNtZDogYWNyIHB1cmdlIC0tZmlsdGVyICdjb250YWluZXIvbXktYXBpOmRldi0uKicgLS1maWx0ZXIgJ2NvbnRhaW5lci9teS1kYjpkZXYtLionIC0tYWdvIDNkIgogICAgZGlzYWJsZVdvcmtpbmdEaXJlY3RvcnlPdmVycmlkZTogdHJ1ZQogICAgdGltZW91dDogMzYwMA==",
            "type": "String",
            "metadata": {
                "description": "Base64 Encoded YAML for the ACR Task."
            }
        },
        "taskSchedule"  : {
            "defaultValue": "0 2 * * *",
            "type": "String",
            "metadata": {
                "description": "CRON Schedule for the ACR Task resource."
            }
        },
        "location": {
            "type": "string",
            "defaultValue": "[resourceGroup().location]",
            "metadata": {
                "description": "Location to deploy the ACR Task resource."
            }
        }
    },
    "functions": [],
    "variables": {},
    "resources": [
        {
            "type": "Microsoft.ContainerRegistry/registries/tasks",
            "name": "[concat(parameters('containerregistryName'), '/', parameters('containerRegistryTaskName'))]",
            "apiVersion": "2019-06-01-preview",
            "location": "[parameters('location')]",
            "properties": {
                "platform": {
                    "os": "linux",
                    "architecture": "amd64"
                },
                "agentConfiguration": {
                    "cpu": 2
                },
                "timeout": 3600,
                "step": {
                    "type": "EncodedTask",
                    "encodedTaskContent": "[parameters('taskContent')]",
                    "values": []
                },
                "trigger": {
                    "timerTriggers": [
                        {
                            "schedule": "[parameters('taskSchedule')]",
                            "status": "Enabled",
                            "name": "t1"
                        }
                    ],
                    "baseImageTrigger": {
                        "baseImageTriggerType": "Runtime",
                        "status": "Enabled",
                        "name": "defaultBaseimageTriggerName"
                    }
                }
            }
        }
    ],
    "outputs": {}
}

The above encoded base64 translates to the following YAML. Note that it includes the required command and some details about the execution timeout limit. For actions that purge a large amount of containers, Microsoft advises you might need to increase this limit beyond the default 3600 seconds (1 Hour).

version: v1.1.0
steps: 
  - cmd: acr purge --filter 'container/my-api:dev-.*' --filter 'container/my-db:dev-.*' --ago 3d"
    disableWorkingDirectoryOverride: true
    timeout: 3600

Summary

Hopefully, you have found this blog post informative and useful. There are a number of scenarios for this feature-set; deleting untagged images, cleaning up badly named containers or even building new containers from scratch. I’m definitely excited to see this feature move to general availability. As always, please feel free to reach out if you would like to know more. Until next time!

Attempting to use Azure ARC on an RPi Kubernetes cluster

Recently I’ve been spending a fair bit of effort working on Azure Kubernetes Service. I don’t think it really needs repeating, but AKS is an absolutely phenomenal product. You get all the excellence of the K8s platform, with a huge percentage of the overhead managed by Microsoft. I’m obviously biased as I spend most of my time on Azure, but I definitely find it easier than GKE & EKS. The main problem I have with AKS is cost. Not for production workloads or business operations, but for lab scenarios where I just want to test my manifests, helm charts or whatever. There’s definitely a lot of options for spinning up clusters on demand for lab scenarios or even reducing cost of an always present cluster; Terraform, Kind or even just right sizing/power management. I could definitely find a solution that fits within my current Azure budget. Never being one to take the easy option, I’ve taken a slightly different approach for my lab needs. A two node (soon to be four) Raspberry Pi Kubernetes cluster.

Besides just being cool, It’s great to have a permanent cluster available for personal projects, with the added bonus that my Azure credit is saved for more deserving work!

That’s all good and well I hear you saying, but I needed this cluster to lab AKS scenarios right? Microsoft has been slowly working to integrate “non AKS” Kubernetes into Azure in the form of ARC enabled clusters – Think of this almost as an Azure compete to Google Anthos, but with so much more. The reason? Arc doesn’t just cover the K8s platform and it brings a whole host of Azure capability right onto the cluster.

The setup

Configuring a connected ARC cluster is a monumentally simple task for clusters which meet muster. Two steps to be exact.

1. Enable the preview azure cli extensions

az extension add --name connectedk8s
az extension add --name k8sconfiguration

2. Run the CLI commands to enable an ARC enabled cluster

az connectedk8s connect --name RPI-KUBENETES-LAB --resource-group KUBERNETESARC-RG01

In the case of my Raspberry Pi cluster – arm64 architecture really doesn’t cut it. Shortly after you run your commands you will receive a timeout and discover pods stuck in a pending state.

Digging into the deployments, it quickly becomes obvious that an amd64 architecture is really needed to make this work. Pods are scheduled across the board with a node selector. Removing this causes a whole host of issues related to what looks like both container compilation & software architecture. For now it looks like I might be stuck with a tantalising object in Azure & a local cluster for testing. I’ve become a victim of my own difficult tendencies!

Right for you?

There is a lot of reasons to run your own cluster – Generally speaking, if you’re doing so you will be pretty comfortable with the Kubernetes ecosystem as a whole. This will just be “another tool” you could add, and you should apply the same consideration for every other service you consider using. In my opinion the biggest benefit of the service is the simplified/centralised management plane across multiple clusters. This allows me to manage my own (albeit short lived) AKS clusters and my desk cluster with centralised policy enforcement, RBAC & monitoring. I would probably look to implement if I was running my datacenter cluster, and definitely if I was looking to migrate to AKS in the future. If you are considering, keep in mind a few caveats;

The Arc Service is still in preview – expect a few bumps as the service grows
Currently only available in EastUS & WestEurope – You might be stuck for now if operating under data residency requirements.

At this point in time, I’ll content myself with local cluster. Perhaps I’ll publish a future blog post if I manage to work through all these architecture issues. Until next time, stay cloudy!