Tech in the 603, The Granite State Hacker

The Cognitive Content Crisis

As many folks in my community may already be aware, my team and I have been building a lot of chatbots lately using the Microsoft Bot Framework. In doing so, we’ve encountered a common issue across multiple clients.


While many people are worrying about lofty issues around artificial intelligence like security, privacy, and ethics (all worthy to be sure), I’m considering something more pragmatic here. Folks go into a cognitive agent build without considering content, how it relates to AI and AI development, and how to manage it. While some of my clients with more mature projects have taken a crack at resolving this issue with custom solutions, these custom solutions are often resource intensive, fail to consider all the business requirements, and end up becoming an unnecessary bottleneck to further development. Worse, waiting until phase 2 or phase 3 of a project to address it compounds the trouble.

Sadly, there’s often an enterprise content management system (ECMS) already in place that could be used instead, right from before phase 1. With reasonable effort, a well-featured existing ECMS can be customized alongside your build-out, saving a massive effort later.

The Backstory

If you check out the Microsoft Bot Framework website, one of the first things you’ll notice is that building conversational agents is a process that cuts across a number of development disciplines… and the first one that typically gets highlighted is Artificial Intelligence.

Artificial Intelligence around conversational agents could include anything from visual identification & classification to moderation, sentiment analysis, and advanced search, but it predominantly revolves around language tools… especially LUIS, QnA Maker, Azure Search, and others.

At this point, it helps to think about what Artificial Intelligence is. Artificial Intelligence is about experience. In a conversational artificial intelligence, that experience is human readable, social, and web like. Experience is content… conversational content.

In fact, it’s almost web like. A user typically opens a chat window (which correlates a bit to a browser) and types an utterance (query). The bot catches such utterances, and depending on a number of factors of origination, data state / context, identity, and authorizations, generally produces a text based response.

Getting More Specific

In the case of a bot designed to coach folks with a chronic disease, for example, a user might ask a bot “Can I eat chocolate cake?”. The bot gets this query, and parses it into language elements… which looks something like “can I eat” as an ‘intent’, and “chocolate cake” as an ‘entity’. The bot then brings in a rule set driven by the conditions it knows about the user (what disease(s) are being managed), and by what the bot knows about the user’s current state (perhaps the blood glucose level if they’re diabetic). Based on the conditions evaluated against the rules, a response must be produced. If you have a sophisticated bot, you might have a per-entity response… Take a response like “OH, chocolate cake is wonderful, but your blood sugar level is a bit high right now. Unless you can find something in a low-carb, sugar free variety, I wouldn’t recommend it, but here’s a recipe you might try instead.” That content (including the suggested recipe) must be authored by subject matter experts, moderated by peers, approved (potentially by regulatory and maybe even legal teams), and tagged to match the rules engine’s expectations… much like web content.

Also note, the rules engine itself is content in a sense. Subject matter experts need a say in tweaking and tuning responses (what’s a high blood glucose level? What’s too much sugar for a high blood glucose level? And so on). To allow that, these rules should be expressed as content a subject matter expert can understand and update.
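
To make the rules-as-content idea concrete, here’s a minimal Python sketch. Everything in it is hypothetical: the ResponseRule fields, the intent name can_i_eat, the glucose thresholds, and the response text are placeholders I made up for illustration. They are not medical guidance and not part of the Bot Framework. The point is only that each response and its conditions can live as data an SME edits, while the bot code just picks the best match.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResponseRule:
    """One authored response plus the conditions under which it applies."""
    intent: str
    entity: str
    condition: str       # condition being managed, e.g. "diabetes"
    min_glucose: float   # rule applies when the reading is at or above this (placeholder value)
    response: str


# Hypothetical content, as it might be exported from a CMS.
RULES = [
    ResponseRule("can_i_eat", "chocolate cake", "diabetes", 180.0,
                 "Your blood sugar is a bit high right now. Here's a recipe you might try instead."),
    ResponseRule("can_i_eat", "chocolate cake", "diabetes", 0.0,
                 "Placeholder response for an in-range reading."),
]


def pick_response(rules: List[ResponseRule], intent: str, entity: str,
                  condition: str, glucose: float) -> Optional[str]:
    """Return the response whose threshold is the highest one the reading meets."""
    matches = [
        r for r in rules
        if (r.intent, r.entity, r.condition) == (intent, entity, condition)
        and glucose >= r.min_glucose
    ]
    if not matches:
        return None  # no authored content for this combination
    return max(matches, key=lambda r: r.min_glucose).response
```

With this shape, changing what counts as “high” is a content edit (one number on one rule), not a code change.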

Another common scenario we’re seeing is HR content. Imagine you’ve got a company that produces an HR handbook every year. Well, actually, you’re a conglomerate that has a couple dozen handbooks, and each employee needs answers specific to the one for their division… Not only do you have to tag the content by year, but by division, and even problem domain. Imagine trying to answer the question “what is my deductible?” It’s easy enough for a bot to understand that this relates to insurance provided through benefits. The answer might be different depending on whether you mean the PPO or the PMO medical plan… or is that a dental plan question? What about vision? They probably mean this year. Depending on the division they’re a part of (probably indicated by a claim in their authorization token), they might have different providers, as well.
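
Here’s a rough Python sketch of what that tagging buys you. The field names, divisions, plan names, and years are all made up for illustration. The idea is that the bot filters on the tags it already knows (year, division from the user’s claim, problem domain) and asks a follow-up question when more than one answer is still in play.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HandbookAnswer:
    year: int
    division: str
    domain: str   # problem domain, e.g. "benefits"
    plan: str     # e.g. "medical-ppo", "dental"
    text: str


# Hypothetical tagged content items.
ANSWERS = [
    HandbookAnswer(2019, "east", "benefits", "medical-ppo", "Placeholder: PPO deductible text."),
    HandbookAnswer(2019, "east", "benefits", "dental", "Placeholder: dental deductible text."),
    HandbookAnswer(2019, "west", "benefits", "medical-ppo", "Placeholder: west PPO deductible text."),
    HandbookAnswer(2018, "east", "benefits", "medical-ppo", "Placeholder: last year's PPO text."),
]


def find_answers(answers: List[HandbookAnswer], year: int, division: str,
                 domain: str, plan: Optional[str] = None) -> List[HandbookAnswer]:
    """Filter by the tags we know; leave `plan` open when the user hasn't said."""
    return [
        a for a in answers
        if a.year == year and a.division == division and a.domain == domain
        and (plan is None or a.plan == plan)
    ]


# "What is my deductible?" from an east-division employee, assuming this year is 2019.
candidates = find_answers(ANSWERS, 2019, "east", "benefits")
if len(candidates) > 1:
    # More than one plan still matches, so the bot should ask a clarifying question.
    follow_up = "Which plan do you mean: " + ", ".join(sorted(a.plan for a in candidates)) + "?"
```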

Back to Development

In the development world, not only do you have the problem domain complexities present, but you also have different environments to push content to… the Dev environment is the sandbox a coder works in actively, and it’s only as stable as the developer’s last compile. Then there might be environments named things like DIT, SIT, UAT, QA, and PROD. To do things right, you should update content in each of these environments discretely… updating content in QA should not affect content in SIT, UAT, or PROD.

Information Architecture


Information architecture (IA) is the structural design of shared information environments; the art and science of organizing and labelling websites, intranets, online communities and software to support usability and findability; and an emerging community of practice focused on bringing principles of design, architecture and information science to the digital landscape.[1] Typically, it involves a model or concept of information that is used and applied to activities which require explicit details of complex information systems. These activities include library systems and database development.

Wikipedia, Information Architecture

We’ll add artificial intelligence cognitive models and knowledge bases, especially for conversational AI, to that definition. Note that some AI applications need big data solutions. Most ECMS products are not big data solutions.

Enterprise Content Management Systems

There are a lot of Enterprise Content Management Systems out there, many of which would be suitable for handling the content management needs of most conversational AI projects.

My career path and community involvement cause me to lean toward SharePoint. If you break down the feature set, it makes sense.

  • Ability for SMEs to manage experience data easily without lots of training to understand create/read/update/delete (CRUD) operations
  • Ability to customize content type structures
  • Ability to concurrently manage individual experience data items
  • Ability to globalize the content (to support multiple languages)
  • Ability to customize workflows (think SME review approval, regulatory, even legal approval) on a per-experience item basis
  • Ability to mark up each experience item with additional metadata both for cognitive processing purposes and for deployment purposes
  • All of this content is then exposed to REST services, so you get the ability to integrate automation to bridge the content into the cognitive models

It’s often said that if you design your data structures properly, the rest of your application will practically build itself. This is no exception. While you will have to build your own automation to bridge the gap between your CMS and your cognitive model environments, you’ll be able to do this easily using REST services.
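
As a small illustration of that bridge, here’s a Python sketch that builds a query URL for the SharePoint REST list-items endpoint (/_api/web/lists/getbytitle('…')/items). The site URL, list title, and column names (Answer, Environment) are assumptions for the example, and authentication and the actual HTTP call are left out entirely.

```python
from typing import List, Optional
from urllib.parse import quote, urlencode


def list_items_url(site_url: str, list_title: str,
                   select: Optional[List[str]] = None,
                   odata_filter: Optional[str] = None) -> str:
    """Build a SharePoint REST URL that reads items from a list by title."""
    # Single quotes inside an OData string literal are escaped by doubling them.
    escaped_title = list_title.replace("'", "''")
    url = f"{site_url.rstrip('/')}/_api/web/lists/getbytitle('{quote(escaped_title)}')/items"

    params = {}
    if select:
        params["$select"] = ",".join(select)
    if odata_filter:
        params["$filter"] = odata_filter
    if params:
        # Keep '$', ',' and quotes readable; percent-encode spaces and the rest.
        url += "?" + urlencode(params, quote_via=quote, safe="$,'")
    return url


# Hypothetical list and columns: pull only the content targeted at the QA environment.
qa_url = list_items_url(
    "https://contoso.example/sites/bot/",
    "Bot Answers",
    select=["Title", "Answer"],
    odata_filter="Environment eq 'QA'",
)
```

Filtering on something like an Environment column is one way to keep a QA content push from touching SIT, UAT, or PROD.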

While you may need to come up with your own granularity, you’ll probably find some clear hits, especially in the area of QnA Maker… every Question / Answer experience pair probably fits nicely as a single content entity. You’ll probably have to add metadata to support QnA Maker’s filtering, and the like.
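
A sketch of that one-item-per-pair mapping in Python might look like this. The input column names (Questions, Answer, Status, Tags) are hypothetical CMS fields, and the output shape (questions, answer, name/value metadata) is my assumption of a reasonable knowledge base record, so check the service documentation for the exact format before relying on it.

```python
from typing import Any, Dict, List


def to_qna_pair(item: Dict[str, Any]) -> Dict[str, Any]:
    """Map one CMS item to a question/answer record with filterable metadata."""
    return {
        # One phrasing per line in the CMS field; blank lines are ignored.
        "questions": [q.strip() for q in item["Questions"].split("\n") if q.strip()],
        "answer": item["Answer"],
        # Tags become name/value pairs, sorted so the output is stable.
        "metadata": [
            {"name": name, "value": str(value).lower()}
            for name, value in sorted(item.get("Tags", {}).items())
        ],
    }


def approved_pairs(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Only content that has cleared its approval workflow goes to the model."""
    return [to_qna_pair(i) for i in items if i.get("Status") == "Approved"]


# Hypothetical CMS items.
SAMPLE_ITEMS = [
    {
        "Questions": "What is my deductible?\nHow much is my deductible?\n",
        "Answer": "Placeholder answer text.",
        "Status": "Approved",
        "Tags": {"Year": 2019, "Division": "East"},
    },
    {"Questions": "Draft question?", "Answer": "Draft answer.", "Status": "Draft"},
]
```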

Likewise with LUIS, you may find that each Intent and its related utterances are a single content entity. LUIS, being more sophisticated, will also need related entities and synonyms modeled in content data.
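
Here’s a minimal Python sketch of how that content might be modeled. The class names and fields are made up for illustration and are not the LUIS schema. Each intent carries its own utterances, and each entity carries a synonym list keyed by its canonical value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IntentContent:
    """One content item: an intent plus the utterances that should trigger it."""
    name: str
    utterances: List[str] = field(default_factory=list)


@dataclass
class EntityContent:
    """One content item: an entity with synonyms keyed by canonical value."""
    name: str
    synonyms: Dict[str, List[str]] = field(default_factory=dict)


def labeled_utterances(intents: List[IntentContent]) -> List[Dict[str, str]]:
    """Flatten intent items into (text, intent) training examples."""
    return [{"text": u, "intent": i.name} for i in intents for u in i.utterances]


def canonical_value(entity: EntityContent, phrase: str) -> Optional[str]:
    """Resolve a phrase to its canonical value via the authored synonyms."""
    wanted = phrase.strip().lower()
    for canonical, synonyms in entity.synonyms.items():
        if wanted == canonical.lower() or wanted in (s.lower() for s in synonyms):
            return canonical
    return None


# Hypothetical content items.
INTENTS = [IntentContent("CanIEat", ["Can I eat chocolate cake?", "Is pizza okay for me?"])]
FOOD = EntityContent("Food", {"chocolate cake": ["choc cake", "chocolate gateau"]})
```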

I’ve seen other CMS systems used, most notably CosmosDB and Contentful. Another choice might be some kind of data mart. All of these cases require a heavy investment in building out a UI layer for your SMEs. SharePoint takes care of the bulk of that part for you.

Got a project you want to start working on? Don’t forget to account for content management early on. As always, reach out to me if you need advice on this or any other aspect of building out a solution involving technologies like these… Connect on Twitter, LinkedIn or the like…


Fix: Drag and Drop File Upload on Published SharePoint 2013 Pages

Ran into a long-standing bug in SharePoint 2013 where a couple of conditions combine to prevent uploading a file to a document library by drag & drop.

The conditions are these:

The document library is exposed as a webpart view on the page.
The page is a published page (as opposed to draft or checked-out) in a site where publishing is enabled at the site collection level.   (note that drag and drop uploads work if the page is not published.)

A number of posts describe the problem out on the interwebs, but I couldn’t find a single one with a working solution… 

They talked about dragdrop.js and the SP.Utilities.CommandBlock undefined error, and setting x-ua-compatible to IE9, and a few other pieces to the puzzle… 

After a consult with the gang around the BlueMetal office, we collectively arrived at the following solution, using Script On Demand functionality…

The fix is to add the following to a script block at the bottom of the master page associated with the published page:

  



    // Drag & Drop fix for publishing pages
    SP.SOD.executeFunc("sp.core.js");
    SP.SOD.executeFunc("cui.js");


That forces those libraries to load for your master page even if nothing else on the page requires them.


Apache Cordova and SharePoint Online / Office 365

The concept came from a good place, but at this point, the story is best described as “science experiment”, as I mentioned at SharePoint Saturday Boston 2015.  I was working on a cross-platform Apache Cordova project for Windows, Windows Phone and Android when the call for speakers hit.  I said “why not?” and I signed myself up to present it…

The good news is that the story’s not without some worth to someone exploring the idea of hooking into SharePoint from an Apache Cordova-based app. Tools that exist today at least assist in the process.

The demo code is mostly about accessing files from your personal SharePoint profile document library (A.K.A. OneDrive for Business), and indeed, the code uses file access calls in addition to the SharePoint connection.  The hardest work in a browser-based app is authenticating with Office 365; this code does that, and then opens up the rest of SharePoint…


AppStudio gotcha

Recently, I upgraded the Granite State (NH) SharePoint Users Group’s website from WSS 3 (MOSS 2007 generation) to SharePoint Foundation 2013.  The upgrade itself went as well as a 2007 to 2010 to 2013 upgrade could go, in general.

The only real “problem” I ran into was the Windows Phone app I wrote for the group years ago.  It was coming up with a 401 error trying to grab content from lists.asmx.  

I spent some time digging in the dirt, trying to resolve the 401, and hit a few common settings known to have an impact, but no good.  

Rather than struggle with it in my not so copious amounts of spare time, I decided to trash the old app, and build a new one with AppStudio.  

The app loads content from the #NHSPUG web site (http://granitestatesharepoint.org), mostly via RSS feeds.  I put a little extra effort into this one.  I found a couple of hours to spend in AppStudio (http://appstudio.windows.com), and after that, I had not only a much prettier v3 of the Windows Phone app, but a Windows 8.1 (tablet style) publishing package as well.

One thing that caught me off guard though… the Gotcha:

The Windows 8.1 edition of the app wouldn’t load the content from the users group website. 

With some debugging, I found that attempts to load the content were coming up with “Unable to connect to the remote server. hresult=   -2146233088”.

Turns out the error had to do with the fact that I had not enabled the capability “Private Networks (Client & Server)” in the Package.appxmanifest.   Ironically, the app works fine anywhere except where I was trying to test it:  on the same network as the content source server. So, to be fair, this is an environmental/configuration issue, not an AppStudio issue, but it was worth mentioning, since my original assumption led me down that path. Maybe this will help someone else.

Oh… Here’s the Windows Phone app:
http://www.windowsphone.com/s?appid=8c1ce3ea-9ffd-46a0-80bd-6b45d1019b32

And here’s the Windows 8.1 (tablet style) app:
http://apps.microsoft.com/windows/app/granite-state-sharepoint-users/01ea0a83-f3af-4be6-abb0-268587072686

And here’s my moment of shame recording the incident and solution in the forums:
https://social.msdn.microsoft.com/Forums/windowsapps/en-US/be7b02cf-25d0-4aa2-8850-e0e2dce21fd2/appstudio-windows-81-apps-not-loading-external-content?forum=wpappstudio&prof=required


Getting Display Names from User Names in a hostile SharePoint environment

I recently ran into a nasty situation where I needed a reliable way to get a list of user Full Names (or Display Names) from a list of usernames in a SharePoint process.

The short answer was easy…  The code runs server side so…

SPUser theUser = web.EnsureUser(username);
string DisplayName = theUser.Name;

//Right?

Well, under normal circumstances, sure.  

In this circumstance, I was checking a list of lists of user names, a condition where I might need to check hundreds of items, each of which could have a list of users to check.

No biggie, just add a lookup table and cache the results over multiple calls so that I only ever have to look a user up once in my process.

Now here’s the real kicker.  In my target environment, EnsureUser comes back instantly if the username is a valid, active user in Active Directory.  If the user is not a valid user?   The command takes over 40 seconds per call to fail!

My solution was two-fold.  

1)  use the aforementioned cache strategy, which I have in my sample code below as _nameMap.
2)  Use a simple worker thread.  Give it two seconds to succeed.  Kill the thread if it takes longer than that for any reason.

I initially made the mistake of using SPContext.Current.Web in the thread, but that can *sometimes* produce a threading violation.   The code below creates a whole new instance of SPSite/SPWeb on every pass, but that’s a lot safer and better performing than a lot of alternatives.

private Dictionary<string, string> _nameMap = new Dictionary<string, string>();

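private string GetUsersWithTempCacheAndTimeoutEnforcement(string rawUsers)
{
    string result = string.Empty;
    SPContext.Current.Web.AllowUnsafeUpdates = true;
    foreach (string aUser in rawUsers.Split(';'))
    {
        try
        {
            string addUser = string.Empty;
            // Take the part after '#'. Entries without a '#' throw
            // IndexOutOfRangeException and are skipped by the catch below.
            string checkUser = aUser.Split('#')[1];
            // Only names containing a backslash (DOMAIN\user) are looked up.
            if (checkUser.Contains("\\"))
            {
                lock (_nameMap)
                {
                    if (_nameMap.ContainsKey(checkUser))
                    {
                        // Cache hit: no lookup needed.
                        addUser = _nameMap[checkUser] + "; ";
                    }
                    else
                    {
                        SPUser userResult = null;
                        // Read the URL here, on the request thread, so the worker
                        // thread never touches SPContext.Current.
                        SPContext context = SPContext.Current;
                        string webUrl = context.Web.Url;

                        System.Threading.ThreadStart st = new System.Threading.ThreadStart(
                            () =>
                            {
                                try
                                {
                                    // Fresh SPSite/SPWeb instances that belong to this thread.
                                    using (SPSite site = new SPSite(webUrl))
                                    {
                                        using (SPWeb web = site.OpenWeb())
                                        {
                                            userResult = web.EnsureUser(checkUser);
                                        }
                                    }
                                }
                                catch (Exception)
                                {
                                    // Lookup failed; userResult stays null.
                                }
                            });
                        System.Threading.Thread workThread = new System.Threading.Thread(st);
                        workThread.Start();
                        // Give the lookup two seconds, then kill it if it is still running.
                        workThread.Join(2000);
                        if (workThread.IsAlive)
                        {
                            workThread.Abort();
                        }
                        if (userResult == null)
                        {
                            // Failed or timed out: cache the raw user name so we never retry it.
                            _nameMap[checkUser] = checkUser;
                            addUser = checkUser + "; ";
                        }
                        else
                        {
                            _nameMap[checkUser] = userResult.Name;
                            addUser = userResult.Name + "; ";
                        }
                    }
                }
            }
            result += addUser;
        }
        catch (IndexOutOfRangeException)
        {
            // Entry had no '#' separator; skip it.
        }
        catch (Exception)
        {
            // Any other failure: skip this entry and keep going.
        }
    }
    return result;
}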


SharePoint 2013 Distributed Cache error "cachehostinfo is null"

Doing a fresh SharePoint 2013 SP1 deployment, I ran into a couple things I want to remember.

1)  SharePoint 2013 RTM won’t install on Windows Server 2012 R2.  You must install using SharePoint 2013 WITH SP1.

2)  Somehow after configuring things, the distributed cache service wouldn’t run on one of the hosts.  The error was “cachehostinfo is null”.   Advice I got was to remove the service instance and re-add it, but even trying to run “Remove-SPDistributedCacheServiceInstance” came back with that same error.  The following PowerShell script allows you to remove the service instance on the machine you’re on, which then frees you up to run Add-SPDistributedCacheServiceInstance.

Originally from StackExchange:
http://sharepoint.stackexchange.com/questions/58326/sharepoint-2013-distributed-cache-cachehostinfo-is-null-with-remove-spdistrib

$SPFarm = Get-SPFarm
$cacheClusterName = "SPDistributedCacheCluster_" + $SPFarm.Id.ToString()
$cacheClusterManager = [Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterInfoManager]::Local
$cacheClusterInfo = $cacheClusterManager.GetSPDistributedCacheClusterInfo($cacheClusterName);
$instanceName ="SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.Service.Tostring()) -eq $instanceName -and ($_.Server.Name) -eq $env:computername}
$serviceInstance.Delete() #You may have to issue the Delete command a couple of times.