Mar 272018
 

As stated in my previous post you need to sync different objects between servers when using an Always On Availability Group. One of those things was SQL Agent Jobs. Once the jobs are synced to all of the servers you need to take another step. That step is to ensure that jobs are running only once so that you don’t have multiple copies either all sucking down resources or even worse… overwriting each other.

I have had to write similar code multiple times at multiple clients and decided to take what I have learned from those (both successes and failures) and build a new version completely outside of current client work so that I can share it with all of you.

Goals

  • Only have jobs run on the primary server when it is in an Availability Group
  • Have the ability to have a job run on a server that does not have an Availability Group setup
    • This is needed if one of your environments does not have an Availability Group (IE Dev/CI/Test etc) or if you are intentionally taking the databases out of the Availability Group.
    • Also important when keeping the jobs looking the same in every environment. In previous versions I had built people forgot to add the steps when it reached PD which caused unexpected errors.
  • Easy to add or remove the job steps
  • Easy way to verify that the job steps exist and have proper configuration

What the process does

Overall Idea

The overall idea is that every time a job runs you need to get to one of 3 states:

  1. The Job runs normally
  2. The Job does not run and complete with a success status
  3. The Job does not run and ends in a failure status when there is a problem

In order to accomplish all 3 states we add 2 SQL Agent Job steps. The reason we add 2 is because SQL Agent Job Steps only have 2 states when exiting the step. Success or Failure.

For this example we have a simple 1 step SQL Agent Job

Original

Step 1
- Run SSIS Package

After the stored procedure to add the AGRun Steps has been executed

Step 1
- On Success go to Step 3
- On Failure go to Step 2
Step 2
- On Success report Success [2. The Job does not run and complete with a success status]
- On Failure report Failure [3. The Job does not run and ends in a failure status when there is a problem]
Step 3 (Former step 1)
- Run SSIS Package [1. The Job runs normally]

Step 2

Step 2 does a number of different checks just to make sure everything is setup correctly.

  1. Does the Configuration exist?
  2. Does the Driver Database exist on this server?
    • Drive Database will be described more in setup section
  3. Also does a check to ensure the SP exists (by failing if it doesn’t)
    • This can occur when you first copy the SQL Agent Job to a new server

Step 1

Step 1 does Everything that Step 2 does plus 2 additional Steps.

  1. It does a check to see if the driver database is in the availability group.
  2. It decides if it should run.
    • If the driver database is the primary in the availability group OR if it is not in an AG and has the bit in the config set to 1 then it completes the step successfully which has the rest of the job run normally.
    • Else It will throw an error so that it goes into Step 2.

Setup

Installation

You can install these objects in any database that is not part of the availability group. I would recommend putting them in a database setup to just hold DBA objects instead of a system or separate application database.

New Objects – Scripts found on github

  • SQLJob – Schema
  • SQLJob.AGRunAddSteps – Stored Procedure
  • SQLJob.AGRunDropSteps – Stored Procedure
  • SQLJob.AGRunStatusCheck – Stored Procedure
  • SQLJob.AGRunConfig – Table
  • SQLJob.AGRunJobList – Table Type
  • SQLJob.AGRunReviewer – View

You will need to populate the AGRunConfig table after you have created it.

AGRunConfig has 3 required columns:

  • ConfigName
    • Name for the config – This will go in the Added Job Steps so that they know which configuration to use
  • DriverDatabase
    • This is the database that you will be watching to see if it is in the availability group or not
  • RunIfNotInAG
    • This is used to control if the job will run or not when it is not in an availability group

Adding the Steps to Jobs

This Adds the AGRun Steps to the list of SQLAgentJobs provided.

  • If the SQL Agent Jobs passed in as a parameter already has the AGRun steps, it will drop and re-add them.
  • This is to allow you to change the Config name easily and allow for re-running of the scripts without error

IsDebug – (Optional Parameter)

  • if NULL or 0 is passed the stored procedure will add steps
  • if 1 is passed the stored procedure will show a before and after of the job steps and then roll back

Example:

DECLARE @SQLAgentJobs SQLJob.AGRunJobList

INSERT INTO @SQLAgentJobs (
   SQLAgentJobName
   , AGRunConfigName
)
VALUES
   ( 'AGDemo - Simple Job' , 'AGDemo' )

EXECUTE SQLJob.AGRunAddSteps
   @SQLAgentJobs = @SQLAgentJobs -- AGRunJobList
   , @IsDebug = 1 -- bit

Removing the Steps from a Job

This removes the job steps if they exist and sets the start step to the appropriate step. It was built presuming that you would not want to do this in bulk so it only takes in 1 job name at a time.

Example:

EXEC SQLJob.AGRunDropSteps
   @SQLAgentJobName = 'AGDemo - Simple Job'

Verification

The view SQLJob.AGRunReviewer is here to give you a general idea of if the AGRun steps have been added and if they have been configured correctly.

The AGRunStatus column has 3 types of statuses

  1. No AGRun Steps Setup
  2. AGRun Steps Configured
  3. Error-(Error Description)

Does error checks for

  1. If the AGRun steps have been bypassed
  2. If the config is currently missing
  3. If the Step1 and Step2 are not using the same config name

Conclusion

Hopefully you will find these steps helpful in getting your Availability Group setup and running correctly.

If you find any bugs please add them as a github issue so that we can get them fixed for everyone.

 Posted by at 9:34 pm
Mar 202017
 

When you setup availability groups you will need to manually sync things like SQL Agent Jobs, logins, as well as additional meta data (More notes here: https://msdn.microsoft.com/en-us/library/hh270282.aspx). I was recently tasked with automating the syncing of the logins so that if we add or remove a login on the primary node, it will flow through to other related nodes. Thankfully the company I did the work for was generous enough to let me post the code and the process.

To start, you’ll need a way for the severs to talk. We already had linked servers setup between the nodes but this process could have been easily replicated by using a SSIS package or even something else.

Second, you’ll have to decide if you are going to use a merging process or a 1-way process. I decided to do a 1-way process as we only have one availability group setup on the servers. If you have multiple availability groups setup on the same servers, then you will need to tweak this process so that it works for you.

I built the following SP and then have a SQL Agent Job run this on all of the secondary nodes every day.

Below is a description of what the stored procedure does. You can find the script in its entirety on my websites github repo.

  • Parameters
    • Linked Server Name – The name of your primary server
    • Commit Changes – Bit – Allow you to run the SP without actually changing anything on your server. Very helpful when setting up a new server.
    • Debug Level – This will become clearer as we go through the whole process
      • Level 0 – No debug output
      • Level 1 – Only the work to do table
      • Level 2 – Most verbose level of data
  • Parameter verification
    • Make sure that the linked server exists and the linked server is the primary for the availably group in addition to other checks
  • Get data from the local and linked servers that the stored procedure will use for comparison
    • Login Info
      • SID
      • LoginName
      • LoginType
      • IsDisabled
      • Please review stored procedure for full list
    • Role Membership
    • Login Permissions
  • Determine which logins to ignore
    • Local logins to each server
    • System logins
    • sa
    • Anything in a new table that you create as part of setting this up called dbo.LoginsNotToSync. These can be SQL Logins, Windows Logins, or Windows Groups
    • If you wanted to, you could also add manual checks in here as well
      • Maybe your DBA logins match a certain format
  • Figure out if a SQL Login needs to be recreated due to the SIDs not matching
    • If you create a sql login manually on 2 servers it will create different SIDs and then when the servers flip between different members of the availability group you will have to fix orphaned users manually
    • As part of this process each of these users will need to be dropped from the secondary server and copied from the primary server
  • Drop/Create/Update logins
    • Drop
      • These are logins that only exist on the secondary server or need to be recreated due to SID
      • Saves the DROP LOGIN off into the Work To Do temp table
    • Create
    • Update
      • The code here is a little more complex but it finds and updates logins that need to have any of the following updates applied:
        • All Logins
          • Default Database
          • Disabled
          • Deny Connect
          • Does Connect need to be revoked
        • SQL Logins only
          • Password Hash
          • Checking if the Password Policy is used
          • Checking if the password is expired
      • Saves this work to the Work To Do temp table
  • Verify that the logins that will be dropped will not cause additional problems
    • Throw an error if any of the logins that will be dropped own a database
    • Throw an error if any of the logins that will be dropped own a SQL Agent Job
  • Sync the server level role membership
    • This process does not setup permissions for any server roles or create any new roles
    • It does make sure that the role membership is the same on both servers
    • Database level roles and permissions are handled manually for system databases or databases that are not in the availability group
    • Inside of the availability group database permissions move with the database
    • Saves off changes into the Work To Do temp table
  • Sync server level permissions
    • Does GRANTs, DENYs and REVOKEs for server level permissions
    • It specifically will not work if you are using GRANT WITH GRANT OPTIONs on permissions. We do not use this type of permission on these servers and did not want to write the code to handle it at this time.
    • Saves off to Work To Do temp table
  • Debugging
    • Level 1 – Output just the Work To Do so you can see what will be run if you commit the changes
    • Level 2 – Outputs the Work To Do as well but also includes:
      • Logins that are being skipped
      • SQL Logins where SIDs don’t match
      • Logins to Create
      • Logins to Drop
      • Base Login Information
      • Server Role Membership
      • Server Permissions
  • Commit
    • If you decide to commit the changes it will run through all of the Work To Do as well as log all of the work that is being done to a table
    • It does this part inside of a transaction as well just to be safe
 Posted by at 11:04 pm
May 282015
 

How many times have you had it when building a script and you build a normal INSERT statement using VALUES

DECLARE @SomeRecords TABLE (SomeRecordName VARCHAR(100))

INSERT INTO @SomeRecords (SomeRecordName)
VALUES
	('Record1')
	,('Record2')

Then suddenly you need to do some calculation for a new column that is needed

DECLARE @SomeRecords TABLE (
	SomeRecordName VARCHAR(100)
	,SomeNeededColumn int
)

Normally you would need to either do it in 2 steps with an INSERT then an UPDATE

INSERT INTO @SomeRecords (SomeRecordName)
VALUES
	('Record1')
	,('Record2')

UPDATE @SomeRecords
SET SomeNeededColumn = CASE WHEN /*some calculation*/ THEN 1 ELSE 0 END

Or rewrite it to use SELECT and UNION/UNION ALL

INSERT INTO @SomeRecords (SomeRecordName,SomeNeededColumn)
SELECT SomeRecords.SomeRecordName
	,CASE WHEN /*some calculation*/ THEN 1 ELSE 0 END AS SomeNeededColumn
FROM (
	SELECT 'Record1' as SomeRecordName UNION
	SELECT 'Record2' 
	) SomeRecords

But, you can actually do this with almost no rework at all by putting it in parentheses and adding a (ColumnName) after the normal table alias

INSERT INTO @SomeRecords (SomeRecordName,SomeNeededColumn)
SELECT SomeRecords.SomeRecordName
	,CASE WHEN /*some calculation*/ THEN 1 ELSE 0 END AS SomeNeededColumn
FROM (
	VALUES ('Record1')
		,('Record2')
	) SomeRecords(SomeRecordName)
 Posted by at 9:15 pm
May 032014
 

I recently spent a while at work cleaning up the permissions on a database. Some of the steps followed:

  • All users are now assigned to a role
  • Appropriate permissions were given to the roles
  • All permissions (except for connect) were removed from the users

Now that I spent all of the time and energy getting the roles setup and getting the development team to agree to and understand how and why we wanted everyone assigned to a role, I wanted to ensure that no future permissions get given directly to users. To that end I wrote this trigger to revert any permissions that someone may try to give. It is still easy to turn off but I wanted to give whomever was assigning the rights another chance to reconsider if this is what they really wanted to do.


CREATE TRIGGER PermissionTrigger ON DATABASE
FOR GRANT_DATABASE AS
BEGIN
--Does a check if the permissions are being assigned to a user (Both SQL and Windows)
IF EXISTS (
SELECT 1
FROM (
--List of Grantee's from the EventData
SELECT Grantee.value('(text())[1]', 'nvarchar(128)') AS PrincipalName
FROM (SELECT EVENTDATA() AS XMLData) x
CROSS APPLY x.XMLData.nodes('//EVENT_INSTANCE/Grantees/Grantee') AS Grantees(Grantee)
) G
JOIN sys.database_principals dp ON dp.NAME = g.PrincipalName
WHERE dp.type IN ('S','U') --SQL Users and Windows Users
)
BEGIN
RAISERROR ('You can not assign permissions to a user. If you think it is in error please contact a DBA',10,1)
ROLLBACK
END
END

Edit: 2017-03-20 script is now on github

 Posted by at 11:25 pm
Sep 132013
 

Initial Symptoms

We got a ticket from a customer where searches were not working and when we executed the SP we got the following error

Msg 30049, Level 17, State 10, Procedure <SP Name>, Line 126
Fulltext thesaurus internal error (HRESULT = '0x8007054e')

Starting Steps

Looked at the SP and got which full text index it was trying to use.

We tried rebuilding the full text index. That didnt work.

We tried rebuilding the Full Text Catalogs… This actually caused SQL Server Management Studio (SSMS) to crash and restart.

We also tried to delete and completely re-add the Full Text Index.

We were able to figure out how to manually reload the Full Text Search Thesaurus for english by executing  EXEC sys.sp_fulltext_load_thesaurus_file 1033,0;
This worked fine in QA but failed where we were working on it. It gave us the error

Msg 208, Level 16, State 1, Procedure sp_fulltext_load_thesaurus_file, Line 60
Invalid object name 'tempdb.sys.fulltext_thesaurus_metadata_table'.
Msg 266, Level 16, State 2, Procedure sp_fulltext_load_thesaurus_file, Line 60
Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements. Previous count = 0, current count = 1.

Where the fun begins

This lead us to looking into tempdb. If you run the following query

SELECT name
FROM tempdb.sys.objects
WHERE type = 'IT'
AND name like ('FullText%')

You should see the following 3 tables:

  • fulltext_thesaurus_metadata_table
  • fulltext_thesaurus_phrase_table
  • fulltext_thesaurus_state_table

None of these tables were there. These are internal tables to tempdb and cannot be created by a user. But normally tempdb is fully recreated when you restart the sql server service.
After a restart of the service…… the tables are still not there.

We obviously did other troubleshooting like dbcc checkdb‘s on both the problem database and tempdb and they came back with no errors.

What finally fixed the root problem

We are not sure what caused the tables to go missing in the first place but they should have come back after the restart of the sql server service. The problem after the restart was that one of the other databases (Database C) on the server was stuck in recovery mode and apparently the thread that adds the tables to the tempdb is the same thread or was blocked by the thread that was trying to bring Database C out of recovery.

  1. To fix it we turned SQL Server back off.
  2. Renamed the mdf for Database C. (This caused the Recovery to be skipped.)
  3. Started SQL Server.
  4. Hurray the tempdb tables were back.
  5. Ran EXEC sys.sp_fulltext_load_thesaurus_file 1033,0; This should not be needed but wont hurt anything.
  6. Renamed the file for Database C back to what it was.
  7. Tried to access Database C from SSMS. (This caused it to look at the file again and put it back into recovery mode so it could finish.)

We were able to verify that Full Text Search was working again and were able to get Database C fixed.

Note: This was on SQL Server 2008 R2

 Posted by at 1:38 pm