Why we care about the SharePoint Cloud Search Service Application

Earlier this month, I was lucky enough to attend Microsoft’s Ignite 2015 conference in Chicago along with a handful of other Content and Code colleagues. When we weren’t eating delicious (but oh-so-cheesy!) deep pan pizza, we were learning about upcoming Microsoft technology changes. Although Ignite focused on Microsoft’s public clouds, a few SharePoint Server 2016 sessions were scattered throughout the conference. We have found that there is still significant demand for on-premises SharePoint expertise, so I made sure to attend those sessions.

Today’s hybrid SharePoint Cloud search challenges

A stand-out session for me introduced the forthcoming Cloud Search Service Application, which looks to be the most significant enhancement to date for the SharePoint hybrid story. It promises to overcome a major obstacle: today, there is no way to “merge” SharePoint Online and on-premises search results. Sure, you can “mash up” results using code (as documented by my colleague Chris O’Brien), but that doesn’t give you a single unified index and its associated benefits (a consolidated relevance/ranking model, and one index to maintain).

While the absence of a merged index is probably the most common user-impacting complaint about today’s SharePoint hybrid search model, we have found through design discussions with our larger enterprise clients that maintaining the on-premises index required for hybrid search can be equally troublesome for SharePoint administrators. Firstly, all but the smallest SharePoint 2013 Search Service Applications require four servers, assuming you need high availability. Search servers are resource hungry, and patching them can be an onerous task (even if the correct procedures are followed). Secondly, it is a fact of life that full crawls must be run regularly in an on-premises SharePoint farm – TechNet lists more than a dozen reasons for doing so. Running a full crawl against a large and/or geographically distributed corpus might take days or even weeks. While a full crawl is running, changes to content are unlikely to be reflected in search queries, which will frustrate your users. There is also that nagging feeling that one day, your on-premises search index might need to be reset, rendering solutions that are dependent on search useless for the duration of the subsequent full crawl(s). Wouldn’t it be nice if you could make this someone else’s problem?

Enter the Cloud Search Service Application

The big idea is that Microsoft look after your search index in the cloud, regardless of whether the content was indexed on-premises or in SharePoint Online (SPO). Since there will be one unified index, your users will finally get a single result set that can be genuinely ranked by relevance.

Prior to Ignite, Microsoft referred to this thing as a “Hybrid Search Crawl Appliance”, so I was slightly surprised to learn that this new capability will be bundled into the Search Service Application. This means that you will still require an on-premises SharePoint Server 2013 or 2016 farm configured to crawl and parse on-premises content, and to support hybrid authentication flows. The footprint of a farm designed to host a Cloud Search Service Application is likely to be a few shoe sizes smaller than a farm supporting today’s hybrid search scenarios, as fewer search components are required. To be more specific, the only on-premises search component strictly required by the Cloud Search Service Application is the crawler, with the option of a query processing component if queries must flow “outbound” from an on-premises farm to SPO. Note: all Search Service Application components (including the admin, analytics processing, content processing, and index components) were present within the on-premises SharePoint 2016 search topology during the Ignite demo. However, I understand that only the crawl and query processing components are actually used when a Search Service Application is “cloud enabled”, meaning that only those components would require performance and capacity planning in this context.

A slide from Microsoft Ignite illustrating how search roles are split between on-premises SharePoint and SPO when using the Cloud Search Service Application.


Other than crawling, the jobs performed by other search components (content processing, ACL mapping, index building etc.) are outsourced to SharePoint Online, meaning that you no longer have to maintain an on-premises search index. A copy of parsed content is stored in SPO so that cloud-based search infrastructure changes do not require on-premises content to be re-indexed. It remains to be seen precisely how much easier a cloud-enabled Search Service Application will be to manage, but I expect some of the administration problems I’ve highlighted in this blog will go away. At the very least, I think that index resets will become a thing of the past. The optimist in me thinks that some of the other reasons to run a full crawl in SharePoint 2013 today will also become non-issues (SharePoint Online runs on something that closely resembles a more recent version of whatever binaries are available on-premises, so patching my SharePoint farm shouldn’t require a full crawl. Right?).

A slimmer on-premises SharePoint farm

During the Ignite session, Microsoft gave an example that indicated a company might need 10 on-premises SharePoint 2013 servers to support a “traditional” Search Service Application. A separate slide indicated that as few as 2 servers would support the Cloud Search Service Application:

A slide from Microsoft Ignite indicating that the Cloud Search Service Application may only require 2 SharePoint Server 2013 servers.


It was unclear whether that example included any servers required for SQL Server, and it raises other questions (will the Distributed Cache Service happily co-exist with the Cloud Search Service Application?), so I’ll wait for more detailed guidance before getting too excited.

Give me more detail!

I know many readers out there will be technical, so I’ve included the bullet point notes I took during the session below. Keep in mind that we got a pretty early view of the capability at Microsoft Ignite, so these details are likely to change between now and General Availability.

The Cloud Search Service Application:

  • Will be shipped in an update to SP2013 later this year (2015), and will be baked into SP2016.
  • Pushes indexed items into a single consolidated SPO index, instead of relying on query federation (the current SP2013 search hybrid model). This means we get a single ranked result set with refiners and search previews. About time!
  • Means that on-premises content shows up in Delve (which uses the SPO search index), albeit without the “rich” thumbnail previews that we get with SPO content.
  • Is able to crawl the same content sources that the present SP2013 Search Service Application can crawl. This is great news, as it means that content housed in older (2007 or 2010) SharePoint farms can be surfaced in SPO, along with other supported content sources such as file shares.
  • Can be consumed by a *SharePoint 2010* farm using today’s SharePoint Service Application federation model, meaning that SPO content can be queried from a SP2010 farm! As you might expect, there are a few trade-offs and constraints here (e.g. no WAC previews, Web Applications must be in claims mode, and an on-premises SP2013 Query Processing Component is needed). This means that older farms need not be islands of information in a hybrid scenario.
  • Can co-exist with other Cloud Search Service Applications to feed a single SPO tenant from multiple locations. I expect that this will be a big deal for some of our globally distributed clients.
  • Strictly respects on-premises permissions. SPO permissions do not “override” on-premises ACLs, even if you are an Office 365 Global Admin.
  • Means that there is no longer a need to have an on-premises search index (save for data residency concerns, see next bullet point).
  • Can co-exist with the current SP2013 search hybrid model (query federation) if some indexed items need to remain on-premises (e.g. for data residency reasons).
  • Will be baked into the SharePoint 2016 Search Service Application. This means we still need a “lightweight” farm to house it, complete with a SQL Server instance. I expect the update for SharePoint 2013 will work in the same way, although that wasn’t clarified.
  • Relies on the same foundational hybrid identity management bits that are needed for today’s SP2013 hybrid solutions (directory synchronisation, OAuth 2.0 trust between SharePoint on-premises and Azure ACS etc.).
  • Relies on an on-premises Office Web Apps farm if search previews of on-premises SP2013 or SP2016 content are required (e.g. from within SPO search results pages). This is the only reason that an “inbound” (SPO -> on-premises) hybrid configuration would be required.
  • Does not “publish” on-premises content externally. We still need a user-facing publishing capability – such as the Web Application Proxy (WAP) or Azure App Proxy – for secure external publishing of SharePoint and/or Office Web Apps. The alternative is that on-premises content can be searched in SPO, but only accessed and previewed on-premises.
  • Encrypts search metadata before it is sent to SPO in batches.
  • Does not support on-premises Site Collection-scoped schema mappings, as those Site Collection objects do not exist in SPO.
  • Introduces a new Managed Property (IsExternalContent) that allows on-premises content to be identified in query rules, result sources, verticals etc.
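For the technically curious, provisioning might end up looking something like the sketch below. Treat this as purely illustrative: the capability was pre-release at Ignite, and the pool name, account, database name and the `-CloudIndex` flag are assumptions based on how a standard Search Service Application is provisioned with PowerShell today.

```powershell
# Illustrative sketch only - pre-release details from Ignite, likely to change by GA.
# Assumes an existing on-premises farm and a managed account for search.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Application pool to host the Search Service Application (names are examples)
$pool = New-SPServiceApplicationPool -Name "SearchAppPool" `
    -Account (Get-SPManagedAccount "CONTOSO\svc-search")

# The assumed -CloudIndex flag is what would mark the SSA as "cloud enabled",
# so that crawled content is pushed up to the SharePoint Online index
$ssa = New-SPEnterpriseSearchServiceApplication -Name "Cloud Search Service Application" `
    -ApplicationPool $pool -DatabaseName "CloudSearch_DB" -CloudIndex $true

New-SPEnterpriseSearchServiceApplicationProxy -Name "Cloud SSA Proxy" `
    -SearchApplication $ssa
```

Even with something like this in place, the usual hybrid identity plumbing (directory synchronisation and the OAuth trust with Azure ACS) would still be needed before on-premises crawls could push content to SPO.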

As you can probably tell, we are pretty excited about getting our hands on this thing! In our experience, search is the most common reason for implementing a hybrid SharePoint infrastructure, and we are pleased to see that Microsoft are addressing the most common pain points about this workload. We will still require an on-premises SharePoint farm to achieve hybrid search, but hopefully it will meet user expectations, result in fewer sleepless nights for SharePoint administrators, and won’t break the bank.

That’s all from me, but you can watch the Ignite session for yourself over on Channel 9: “Implementing Next Generation SharePoint Hybrid Search with the Cloud Search Service Application”.


Ben Athawes

Head of SharePoint Platform

Ben leads Content and Code's SharePoint Platform practice which focuses on the more technical aspects of SharePoint Online, SharePoint on-premises and everything in between. He has been working with SharePoint and related technologies such as SQL Server and AD FS since 2008.


SharePoint Migration with Metalogix – lessons learned

I just wanted to share some lessons learned from a recent SharePoint migration project we have undertaken using both the Metalogix Essentials and Content Matrix tools. In this post I will share some potential pitfalls so that hopefully you can avoid these complications in your own migration scenarios.


I had a client wanting to migrate their WSS 3.0 environment, and some old BPOS sites, into SharePoint Online. While carrying out the migration of the BPOS sites and preparing for the WSS 3.0 migration, a few issues arose that needed to be rectified so that both migrations would go as smoothly as possible.

Unfortunately, there was a time lapse between the initial work being carried out in BPOS and the next phase of the migration; in practice, this mainly meant looking at an incremental migration instead of a full migration.

Lessons learned from the SharePoint migration project:

The following lessons were learned from the issues that were experienced during this SharePoint migration:

User mappings

The Metalogix Essentials tool can sometimes map users automatically: if user UPNs follow a consistent format, e.g. firstname.lastname@domain.com, it can map the created/modified by attributes. However, this does not always work, and when it fails content ends up labelled with the account that is logged in while performing the migration. A migration service account can be used; however, it needs to exist in both the source and target environments to map correctly.

Essentials uses a CSV file to map old user credentials to new user credentials

This allows the correct created/modified by name to be used when content is copied. However, if a mapping file is not used when content is originally copied, you cannot then use a delta migration to update the attributes afterwards: Essentials sees the file in the target environment as up to date and so does not change it. A full migration, or a recopy with metadata using a CSV, would be required to change the values. Please note that a copy with metadata takes considerably longer to complete.

Getting a list of users from AD is best done using the Microsoft gallery for PowerShell scripts

Although Essentials can be used to export site users from connected environments, the reports are produced per Site Collection, so to retrieve all users it is better to connect to AD and export the users in CSV format. Remember, this is only the first part of the process: you also need to export the users from Azure AD (via the admin portal or PowerShell) before they can be mapped. The display name can normally be used with VLOOKUP formulae in Excel to match the old and new accounts, even if the format needs to be tweaked slightly.
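As a rough sketch of those two export steps, something like the following would produce the source and target CSVs. It assumes the RSAT ActiveDirectory module on-premises and the MSOnline module for Office 365; the filters and file paths are examples only, so adjust them to your environment:

```powershell
# Export source accounts from on-premises AD (assumes the RSAT ActiveDirectory module)
Import-Module ActiveDirectory
Get-ADUser -Filter * -Properties DisplayName, UserPrincipalName |
    Select-Object DisplayName, UserPrincipalName |
    Export-Csv -Path "C:\Temp\SourceADUsers.csv" -NoTypeInformation

# Export target accounts from Azure AD (assumes the MSOnline module)
Import-Module MSOnline
Connect-MsolService
Get-MsolUser -All |
    Select-Object DisplayName, UserPrincipalName |
    Export-Csv -Path "C:\Temp\TargetAADUsers.csv" -NoTypeInformation
```

The two CSVs can then be matched on display name in Excel using VLOOKUP, as described above.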

The user mapping file for Essentials cannot be used in Content Matrix as it does not use a CSV for user mappings

This either means manually mapping all user accounts – which is no good in time-critical projects, or when clients are being billed – or using an XML file instead. After having to switch from Metalogix Essentials to Content Matrix when the licence timed out, I had to find a way to create an XML file.

I wrote a small PowerShell script to create the XML file from a CSV file, and with some assistance from a colleague I completed the script after an issue was rectified. The script creates a “well-formed” XML file from the values that are mapped in the CSV file; Metalogix have a great article with further information.

The PowerShell script that I have created is shared below and can be copied into a file or a Windows PowerShell ISE window:

#Import user mappings from existing CSV file - change the path and filename accordingly
$Import = Import-Csv -Path "C:\Users\<User>\Documents\Import_CSV_Mappings.csv"

#Create well-formed XML including the CSV mappings and add to the variable
$xmlData = "<Mappings>"

for ($i = 0; $i -lt $Import.Length; $i++) {
    $xmlData += "<Mapping Source='$($Import[$i].Column1)' Target='$($Import[$i].Column2)' />"
}

#Close the root element once all mappings have been added
$xmlData += "</Mappings>"

#Output XML in variable to a file - change path and filename accordingly
$xmlData | Out-File "C:\Users\<User>\Documents\Export_XML_Mappings.xml"

Note: the above script is shared as an example and Content and Code take no responsibility for any issues that may occur in the running of this script.

Happy migrating!!

About our author

Lee Palmer

Solutions Consultant | Content and Code

Lee is a Solutions Consultant working in the Enterprise Solutions Architects team. Although his primary focus has been SharePoint – previously on-premises deployments, and for the past three years SharePoint Online – he has evolved to help clients build solutions across the products in the Office 365 stack, including Microsoft Teams. Lee has helped a number of clients with migrations from older versions of SharePoint, BPOS, and file shares into SharePoint Online. He is also accredited with Metalogix and Nintex, with whom Content and Code work very closely.


Are you managing your data correctly in SharePoint on-premises?

In an ideal world, information architecture and governance should be in place before any SharePoint sites are launched to end users. However, in reality they often arrive the other way around, or simply don’t exist at all. As a result, data management becomes a challenge.

Sometimes, even with governance guidelines in place, you can never guarantee that end users have the same level of understanding and follow the guidelines correctly.

Managing structured and unstructured data


Managing structured data is a totally different story from managing unstructured data. For a traditional application, managing structured data is mainly about managing databases. But for SharePoint on-premises, apart from managing all the content databases, it also involves managing a number of variables including SharePoint Sites, Lists, Site Columns, Content Types, Term Sets, etc.

Administrators tend to like structured data because it’s easy to search and manage. End users, however, might have a different view, as it can mean more restrictions and work for them – and this can cause conflict between departments.

We all know the importance of database management, but that is a whole other topic on its own. Here we are looking at managing data strictly at the SharePoint level, and at how to turn unstructured data into structured data.

Data comes in a variety of different guises


First of all, data can come in various forms. Here is a short list of the common ones:

Files / documents

It’s very easy to upload a file to SharePoint. But is it easy for your end users to find them or navigate to them?


  • Size of files

SharePoint 2013 has a default limit of 50MB per file. What do you do if your file exceeds the limit? Would you simply increase the limit, or work out a better solution?

With SharePoint 2013, you can increase the limit to a maximum of 2GB. But in doing so, remember that the limit applies to the whole web application rather than to a specific Document Library. Another important factor to consider is the user experience of accessing a 2GB file from a SharePoint site.
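As a sketch of how the limit is raised with PowerShell (the URL below is an example; the MaximumFileSize property is set in MB, with 2047MB being the SharePoint 2013 ceiling):

```powershell
# Raise the maximum upload size for a web application.
# Note: this is a web application-wide setting, not per Document Library.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$webApp = Get-SPWebApplication "http://intranet.contoso.com"
$webApp.MaximumFileSize = 2047   # value in MB (2047 MB ~ 2 GB)
$webApp.Update()
```

Before reaching for the maximum, though, it is worth asking whether very large files belong in SharePoint at all.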


  • Type of files

There is no restriction (apart from executable files) on the type of files that can be stored in a standard Document Library. However, you might want to think again about where to store images or videos.


  • Contents in the documents

Natively, SharePoint cannot index the contents of a document, which makes it hard for users to find the relevant documents without prior planning.

SharePoint Lists

Because content databases are managed by SharePoint, and are not supported by Microsoft once they are amended manually, SharePoint lists are the first option for storing structured data. Because of that, SharePoint lists are sometimes designed and used as database tables or views without awareness of their limitations – for example, list view thresholds and the maximum number of supported lookup columns.


Emails / Alerts

Email is probably still the most commonly used way to share information. But it’s hard to track, as most of the data sits in email bodies, which are unstructured. Governance and training are required to help users understand the benefits of sharing data in other ways. Some restrictions can also be implemented to stop users from sharing certain files.


SharePoint Pages

SharePoint Pages are easier to index in SharePoint than documents, because the content of a page is plain text and often follows a predefined template. Documents, by contrast, are often not plain text, and are more flexible and rich in terms of content and format.

Articulating an important companywide message?


Email communications to “All Company” might be the first port of call here. But, as mentioned above, complicated and unstructured data is hard to track. And if you already have an Intranet built on SharePoint, why not utilise it?

What if the announcement contains a fair amount of data?


You might not want a particular announcement to take up valuable real estate on your intranet home page; normally, a link to the details is provided instead. But the question then is whether the detailed information warrants an entire page, or the creation of a single document. This largely depends on how well your SharePoint intranet site was built and how well your SharePoint search is optimised.

Generally, it’s more difficult to index the content of a document than the content of a page. So, creating a page for the announcement sounds like a better option. But do you know if your users can still find the detailed information (e.g. a new policy) once the announcement is removed from your home page?

Let’s take this up a notch


If the details of the announcement consist of a few different training materials, there are a few options available to ensure that you are managing data in your SharePoint environment correctly.

Pages Library

Depending on the types of content you could be better off creating multiple pages in a shared SharePoint Pages library. It is fairly easy to create a page in SharePoint if the predefined SharePoint layouts meet the requirements of your announcement.

Document Libraries

Again, depending on the type of training materials, these could be created as PDF, Word or PowerPoint documents and saved to a document library. Beyond rich-content requirements, if the materials need to be accessible offline, SharePoint pages would not be a good option.

Document Sets

If your materials come in smaller batches for end users to consume, creating Document Sets could be a good option; they work well to group a small set of documents together.

What about restricting access to sensitive documents?


In many cases it’s important to consider that some business critical and sensitive materials should only be accessible by certain groups of staff or departments within the business. You must now determine the best approach to ensuring that your data is effectively managed within your SharePoint environment.

Single document library

Should the materials be created in folders with unique permissions within the same Document Library? Although technically doable, item (folder) level permissions are generally not a great idea, as they are less intuitive and create management overhead.

Multiple document libraries

Or should the materials be spread across multiple Document Libraries, each with its own unique permissions?

There are a few key differences between the two approaches. With multiple document libraries, you can easily set up a dedicated content type for each of them, which helps with search, and you can create a dedicated list view for each library to present the data.

What if your announcement contains rich content?


Further to the above, what happens if the training materials you release come with pre-created videos, documents and images? Would a few separate SharePoint pages and libraries suffice? Perhaps not: too many document libraries in the same site can make it hard to navigate and find the right information.

Does a sub site sound like a better choice? You would then have a dedicated place to manage all the relevant materials, making it much easier for users to find them, and for site owners to manage access to the data.

Enterprise Social vs Discussion Boards


It is also important to consider what happens to your data if you want feedback on your announcement and want to engage your workforce around the topic.

You have a few options here. SharePoint itself has an out-of-the-box Discussion Board feature that could work well for this purpose. Data is stored within SharePoint, although some customisation will be required to help SharePoint search crawl and understand the data.

But what about the wider Office 365 product stack? If your organisation is looking to move to Office 365 in the near future, Yammer could provide the perfect platform for engaging your workforce and creating an environment that actively encourages wider discussions throughout your organisation. Data is stored in the cloud, and there are plenty of statistics features and out-of-the-box search to help manage the data in Yammer.

It’s always good to keep your options open with data management


It’s always good to have more options. However, without sufficient understanding and planning, they can also work against productivity. How do you know whether your data is managed correctly and is reaching the maximum number of your end users? Remember: the easiest option might not be the best option. A health check can help you determine how much productivity has been lost to poor data management.



About our guest author

Michael Wang

Technical Account Manager | Content and Code

Michael leads the DevOps team and is responsible for our Development On-Demand service prominently for SharePoint in Office 365, or on-premises. Michael has worked on a vast number of DevOps projects for clients large and small, often with complex and dynamic requirements.


What is IT Service Management?

IT Service Management (ITSM) refers to the entirety of activities – directed by policies, organised and structured in processes and supporting procedures – that are performed by an organisation to plan, design, deliver, operate, and control IT and Cloud services, such as Office 365, to end-users.

IT Service Management Challenges

  • Increased Business Scrutiny: the need for IT cost transparency and business-value demonstration
  • Increased business (and user) expectations: around agility, availability, support and end-user service management
  • Increased business and IT complexity: particularly, hybrid cloud, mobility, and compliance

In the cloud environment, there is a clear focus shift from IT managing the infrastructure components to understanding what IT achieves rather than what it does. This comes in many different guises:

“How can we better serve our users?”

“How can we better demonstrate the value IT delivers?”

“How do we evolve into an IT organisation that’s fit for operating in the cloud and the changing business needs and expectations?”

“How do we improve IT support based on actual business needs?”

The Role of IT after migrating to the cloud

Moving to the cloud fundamentally changes the paradigm of what IT does within any given organisation. Rather than managing infrastructure, maintaining servers, deploying patches and upgrading software, Office 365 enables IT departments to focus on the business value and benefit this technology brings to their organisation.

New and modern ways of working have meant that IT Teams must now be able to manage a diverse range of devices, help employees connect from anywhere, and provide quick access to applications and information – while maintaining stringent security measures and ensuring they are staying compliant at all times.

The cloud allows IT departments to take the lead by bringing new services and applications to your organisation before the business realises that it needs them.

With evergreen cloud productivity platforms such as Office 365, IT is now positioned to deliver a consistent and optimised user experience across multiple platforms and devices. Now you can shift the perception that IT is a ‘blocker’ to one that is a strategic partner vital to business success.

IT organisations are now looking ahead

Forward-thinking organisations are now looking at how to optimise their cloud investments and solve business challenges with their new infrastructure capabilities.

Moving to cloud based applications has now seen a shift towards bringing IT closer to business units and end-users. This also translates into an individual skill set change for employees, who shift from needing deep technical expertise to developing higher-level strategies, improving business acumen, and becoming more customer-centric in providing IT services.

The operational role of IT has shifted to a mode of empowerment and enablement. Operationally this means keeping a finger on the evergreen pulse of the cloud services and supporting customers in understanding the capability and appropriate use cases for different solutions.

Moreover, the organisational structure of IT has changed to align more with the continuous cycle of design, build, finance and operations. Ultimately, you are building the foundations that enable your organisation to reach for the clouds.

Evergreen Management

Microsoft define evergreen management as:

“The act of managing continuous evolution of features and functionality in the cloud to achieve business benefit while avoiding any adverse side effects.”

What focus points does this introduce?

For IT Pros:

  • Business Value programs
  • Evolution of Operations
  • Modernising the Service Desk

For IT Management:

  • Changing the IT organisation to support Office 365

For the IT Organisation:

  • Plan how each role group will evolve with respect to Office 365 workloads

So, what tools are available to IT Teams to manage Office 365?

Office 365 Content Adoption Pack

How do you drive the most value from your cloud services for your organisation? The first step is to understand how your users are using Office 365 and, probably more importantly, which of your end users are not using a specific Office 365 service.

This allows you to be targeted in your end-user communication and training. To help you with that, the Office 365 Admin centre comes with a set of rich usage reports that help you understand how your users are using and adopting Office 365. Usage reporting allows you to drill down to specific users to understand their interaction with specific Office 365 services.

Usage reports are a good start, but if you want to pivot and analyse the information, Microsoft provide a Content Adoption Pack in Power BI. This combines all the data insights from the Office 365 usage reports with the analytic capabilities of Power BI.

The dashboard is split into four areas:

  • Adoption
  • Communication
  • Collaboration
  • Activation

Each area gives you more specific usage insights, and clicking on each area gives you access to the underlying reports and information. The adoption dashboard combines the usage information with your Active Directory information, which allows you to pivot the information by department, team, and location.

The information displayed in these dashboards is also relevant to people outside of the IT organisation; you can provide access by sharing the dashboards with those key stakeholders.

Each organisation has specific requirements around data analysis, so Microsoft have provided the capability to customise the reports.

Office 365 Message Centre

Office 365 is an evergreen service that makes new features and services available to end users and admins on a regular basis. So how do you stay up to date?

For an IT department, it is important to understand what new features are coming to their environment so that they can inform their end-users about them. The Office 365 Message Centre is one of the main channels that Microsoft use to inform you of updates that they are delivering:

  • Notifications of updated features
  • New features that are coming to Office 365
  • Inform IT support teams about issues and any actions that need to be taken, for example on your on-premises environment

The information provides details about the new or changed functionality, what you need to do and when the changes will start rolling out.

Security and Compliance

Within the Office 365 Admin Centre there is a separate Security and Compliance centre to provide you with a one stop shop to manage all security and compliance activities. Functionality for this includes:

  • Full audit log searching
  • Advanced security management
  • Alert policy configuration

Service Incidents

The Admin Centre provides information on service incidents via the Service Health dashboard. For any incident, you want to know:

  • How many of your users are impacted?
  • How long will it take before the service incident is resolved?
  • Is there a work around you can implement?

The Service Health Dashboard will provide you with this information and personalised information about that specific service incident.

To help you identify high impact issues, they are separated out into two categories, ‘Incidents’ and ‘Advisories’.



About our author

David Francis


Integrated Services Director | Content and Code

Dave is responsible for delivering Content and Code’s Office 365 Service Introduction engagement to clients. Content and Code have long recognised the importance of embedding Service Introduction activities into every project lifecycle and service design activity, and understand the transformation clients need to handle the constant service changes of an “evergreen environment”.

